The SORITEC Sampler (tm)
Version 1.06B-1.67
From
The Sorites Group, Inc.
PO Box 2939
Springfield, VA 22152
March 13, 1985
TABLE OF CONTENTS
Chapter 1 Introduction........................ 6
1.0 Introduction.......................... 6
1.1 What is SORITEC?...................... 6
1.2 SORITEC Sampler....................... 7
1.3 Getting Started....................... 8
1.4 Invoking SORITEC Sampler.............. 9
1.4.1 Interactive Processing.............. 9
1.4.2 Batch Processing.................... 10
1.5 Executing SAC Files................... 10
1.6 SORITEC Input Journal Files........... 11
Chapter 2 SORITEC Syntax...................... 12
2.0 Introduction.......................... 12
2.1 Variable Names........................ 12
2.2 Special Symbols....................... 12
2.3 Variable Types........................ 13
2.4 Selection of the Observation Set...... 15
2.4.1 Conditional Selection of the
Observation Period.............. 15
2.5 Transformations....................... 16
2.6 Revising Data in SORITEC.............. 18
2.7 Missing Data Handling................. 19
2.7.1 Missing Value Symbol Declaration.... 20
2.7.2 Missing Value Logical Function...... 20
2.7.3 Imputation of Missing Values........ 21
2.8 Wildcards............................. 21
2.9 Options............................... 22
2.10 Recovering Internal SORITEC
Variables........................... 22
2.11 SORITEC's Symbol Table................ 23
2.12 Minor Control Statements.............. 24
2.12.1 Specify Width of Output Device..... 24
2.12.2 Change Length of Input Line........ 24
2.12.3 Reset Maximum Error Limit.......... 25
2.12.4 Turn Batch Listing On or Off....... 25
2.12.5 Label Batch Output Pages........... 25
Chapter 3 Data Entry and Output............... 26
3.0 Introduction.......................... 26
3.1 SORITEC Alternate Load (SAL) Files.... 26
3.1.1 SAL File Input...................... 27
3.1.2 SAL File Output..................... 27
3.2 Data Interchange Format (DIF) Files... 28
3.2.1 DIF File Input...................... 28
3.2.2 DIF File Output..................... 30
3.3 Formatted Input and Output............ 31
3.3.1 FORTRAN Formatted Input............. 31
3.3.2 FORTRAN Formatted Output............ 32
3.4 Keyboard Entry........................ 33
3.5 Output of Data to the Terminal........ 34
3.5.1 Tabular Display..................... 34
3.5.2 Graphical Display................... 34
3.6 SORITEC DataBank Files................ 36
Chapter 4 SORITEC Databank (SDB) Files........ 37
4.0 Introduction.......................... 37
4.1 Create a Databank..................... 37
4.2 Access a Databank..................... 37
4.3 Release a Databank from SORITEC....... 38
4.4 Purge a Databank...................... 38
4.5 Retrieve Items from a Databank........ 38
4.6 Store Items in a Databank............. 39
4.7 Replace Items in a Databank........... 39
4.8 Rename Items in a Databank............ 39
4.9 Switch the Names of Two Items
in a Databank...................... 40
4.10 Discard Items from a Databank......... 40
4.11 Generate a Directory Listing
of a Databank...................... 40
Chapter 5 Programming Constructs.............. 41
5.0 Introduction.......................... 41
5.1 Numeric Looping....................... 41
5.2 Unconditional Branching............... 42
5.3 Conditional Branching................. 43
5.4 Null (Continuation) Statement......... 43
5.5 Alpha Looping......................... 43
Chapter 6 Dummy Data Series Generation and
Special Transformation Commands..... 45
6.0 Introduction......................... 45
6.1 Create a Time Trend Dummy Series..... 45
6.2 Create Seasonal Dummies.............. 45
6.3 Recode a Variable.................... 46
6.4 Conversion of Time-Series from
One Periodicity to Another........ 46
6.5 Maximum Function..................... 47
6.6 Minimum Function..................... 48
6.7 Modular Division..................... 48
6.8 Compute Moving Average............... 49
6.9 Compute Moving Sum................... 49
6.10 Statistical Operations............... 49
6.10.1 Correlation Matrix Calculation.... 49
6.10.2 Covariance Matrix Calculation..... 49
6.10.3 Other Statistical Operations...... 50
Chapter 7 SORITEC Financial Functions......... 51
7.0 Financial Functions in SORITEC........ 51
7.1 Internal Rate of Return............... 51
7.2 Present Value......................... 52
7.3 Loan Amortization..................... 53
Chapter 8 SORITEC Sampler Cross-Section
Techniques.......................... 55
8.0 Introduction.......................... 55
8.1 Synopsis.............................. 55
8.2 Crosstabulation Analysis.............. 56
Chapter 9 Estimation and Forecasting with
SORITEC Sampler..................... 57
9.0 Introduction.......................... 57
9.1 Ordinary Least Squares (OLS)
Estimation......................... 57
9.2 Autocorrelation Techniques for the
Single Equation Model.............. 58
9.2.1 Cochrane-Orcutt Iterative Technique. 58
9.2.2 Hildreth-Lu Scanning Technique...... 58
9.3 Two-Stage Least Squares (2SLS)
Estimation......................... 59
9.4 Forecasting Single Equation Models.... 59
Chapter 10 SORITEC Interactive Print Server... 62
10.0 Introduction......................... 62
10.1 Entering Tableau Mode................ 62
10.2 Tableau Descriptions................. 63
10.2.1 Coefficient Display............... 63
10.2.2 Regression Summary Table.......... 63
10.2.3 Residual Autocorrelation Summary.. 63
10.2.4 PDF and Histogram of
Standardized Residuals............ 63
10.2.5 Non-Parametric Residual
Distribution Tests................ 63
10.2.6 Regression ANOVA Table............ 64
10.2.7 Covariance Matrix of
Coefficient Estimates............. 64
10.2.8 Correlation Matrix of
Coefficient Estimates............. 64
10.2.9 Beta Coefficients, Elasticities
and Partial R..................... 64
10.2.10 Statistical Summary of
Exogenous Variables............... 64
10.2.11 Actual vs Fitted Plot and
Standardized Residuals............ 64
10.3 Interactive Crosstabs................ 65
APPENDIX I SORITEC INTERNAL SYSTEM NAMES.... 66
APPENDIX II GLOBAL OPTIONS AND DEFAULT
SETTINGS IN SORITEC........... 69
APPENDIX III QUICK REFERENCE LISTING OF
SORITEC Sampler COMMANDS...... 73
APPENDIX IV DETAILED FEATURE LIST FOR SORITEC
VERSION 1.06B................. 76
INDEX
SORITEC INFORMATION REQUEST FORM
Chapter 1
Introduction
1.0 Introduction
This econometrics package, called SORITEC Sampler, is provided to you
free of charge from the Sorites Group, Inc. (SGI) of Springfield, Virginia.
SGI is a software engineering firm which has been developing and supporting
a machine-independent econometric modeling package since 1978. Our
package, called SORITEC (SORITes EConometric), is now supported on 22
different mainframes, minicomputers and microcomputers and still has only
one reference manual. The program's command syntax is identical on all
machines.
In the spring of 1984, we made our first port to the IBM PC. Unlike
other econometric packages for microcomputers, the full version of SORITEC
for the PC is not a subset of the mainframe package. In order for the
program to operate, the full power of the IBM PC/XT, PC/AT or compatible
computer must be available. This means that your system must have a hard
disk and 512K of RAM. The 8087 math co-processor is required for the full
version of SORITEC.
The availability of advanced econometric and statistical techniques
including full information maximum likelihood (FIML), and non-linear simul-
taneous equations estimation and simulation for a fraction of the price of
similar capabilities on a mainframe has put us in the forefront of using
the full potential of the IBM PC and compatibles. Given the increasing
power and declining costs of micro-computers, our original belief in the
need for a machine-independent econometrics package has proved correct.
1.1 What is SORITEC?
SORITEC is a sophisticated econometric modeling and forecasting system
that allows you to estimate or solve and simulate almost any mathematical
model that you might specify. The program enables you to do econometric
time-series analysis within an easy-to-use command language syntax. Those
of you familiar with TSP will find SORITEC's command language similar.
SORITEC can handle models with hundreds of equations, either linear or
non-linear, in either a static or dynamic framework. Model systems can be
specified, built, rearranged, databanked and manipulated by name. Once a
model is constructed, it can be recalled and resimulated by a single com-
mand. SORITEC provides a report writer capable of providing detailed and
complex reports with minimal effort and training. SORITEC is also a com-
plete data processing language that lets you do varied and complex data
reduction operations easily. The combination of its econometric methods
and report writing capabilities permits SORITEC to handle most current
production reporting automatically via command files. SORITEC also
contains most of the useful statistical functions of the leading mainframe
statistics packages. As new versions of SORITEC are released, we expect it
will soon exceed the pure statistical capabilities of these packages.
Appendix IV of this document provides a complete list of features in
SORITEC.
A new version of SORITEC (Version 1.06B), available in February 1985,
incorporates significant enhancements to the system's analytical capabili-
ties and user friendliness, including multivariate techniques such as
PROBIT, CROSSTABS and ANOVA, the most complete set of regression diagnos-
tics available, and tableau-oriented regression output. Mainframe and
minicomputer versions of SORITEC are available to universities for teaching
purposes free of charge (except for a small processing fee). Contact SGI
or a local distributor for prices.
1.2 SORITEC Sampler
SORITEC Sampler is a subset of the full SORITEC and is equivalent to
econometric packages sold today for $200 to $400. It is supplied free of
charge, and may be reproduced and distributed freely as long as no fee is
charged and no alterations are made. The program requires 384K of random
access memory and can run off floppy diskettes. The 8087 math coprocessor
is recommended.
We decided to give SORITEC Sampler away for several reasons. First,
distributing a version of SORITEC which is useful, free and reproducible is
a cost-effective method of advertising this type of product. You are
encouraged to make as many copies as you wish and pass them on to friends
and colleagues. Second, we needed a demo copy to illustrate SORITEC's
command structure, data handling capabilities and techniques. Rather than
sending out a demo disk that simply went through some song and dance
without allowing you to really "touch" the package, we figured that a
"live", though limited, version of the real thing would be an excellent
demonstration of SORITEC's features. Lastly, SORITEC Sampler is based on
the belief that there is no justification for charging to estimate single
equation models.
The techniques described in this Reference Manual are those supported
by SORITEC Sampler. However, they function identically to those in
SORITEC. SORITEC Sampler provides a useful introductory econometrics
package that will encourage people to apply econometric and statistical
techniques. In return, we hope that you will consider us when you want
more econometric capability on your computer and will help spread the word
about SORITEC by passing this package around. We, in turn, will continue
to reinvest our revenues in product development instead of elaborate
advertising.
You can obtain the latest release of SORITEC Sampler plus a bound copy
of this Reference Manual and the full SORITEC Reference Manual by sending
(U.S.)$50.00 to SGI. Consult the form at the end of this document for
further details. Note that SORITEC Sampler is NOT a supported product and
is distributed without warranties of MERCHANTABILITY and FITNESS FOR A
PARTICULAR PURPOSE.
1.3 Getting Started
SORITEC Sampler is distributed on two diskettes; a third diskette
contains this documentation and examples. You can make a backup of the
diskettes for safekeeping or distribution by using the DOS COPY command.
Use COPY a:*.* b: to copy all files on the diskette to another diskette.
If your system has a hard disk, use COPY a:*.* to copy all files on the
diskette to your current directory.
SORITEC Sampler requires at least 384 KB of RAM and DOS 2.0 to
execute. An 8087 or 80287 math co-processor is optional, but recommended.
The program may be run either from floppy diskettes or from a directory on
a hard disk system.
Sampler has some minimum system requirements which may require you to
change your CONFIG.SYS file. The following commands must be included in
the CONFIG.SYS file before running the program.
DEVICE = ANSI.SYS
FILES = 12
BUFFERS = 12
BREAK = ON
To invoke SORITEC Sampler on systems without a hard disk, insert Disk
1 of 2 into the current drive and enter:
SAMPLER
followed by a carriage return. After a moment, you will be prompted to
insert Disk 2 of 2 into the current drive. Replace the first diskette with
the second and enter a carriage return. Once the SORITEC Sampler banner
appears on the screen, follow the instructions displayed there.
To invoke SORITEC Sampler on systems with a hard disk, you should
first copy the SAMPLER.EXE from the first distribution disk, plus all .OVL
files and the SAMPLER.FMT file from the second distribution diskette to a
directory or subdirectory on the disk. Invoke the program, as outlined
above, by entering
SAMPLER
and follow the instructions once the banner appears on the screen.
Use the DOS PATH command to identify the subdirectory in which your
SAMPLER.EXE, .OVL and .FMT files are stored if you want to invoke SORITEC
Sampler from any other directory or subdirectory on your hard disk. Sampler
will refer to its "home" subdirectory to load overlays, etc., but will look
for all input files and write all output files, including the input journal
file, to the current directory unless another directory is explicitly
specified in the SORITEC command.
SORITEC Sampler also supports DOS redirection of standard input and
output devices so that filename arguments may appear on the command line.
Any combination of:
SAMPLER [ < [d:][path]filename ]
[ > [d:][path]filename ]
[ >> [d:][path]filename ]
are legal arguments in the command line. Refer to the "Advanced DOS
Commands" chapter of your DOS manual for information about I/O redirection.
DOS redirection is particularly useful with SORITEC's batch processing
facility.
Do not invoke DOS redirection if you are running SORITEC Sampler on a
floppy disk system.
1.4 Invoking SORITEC Sampler
SORITEC Sampler executes in both interactive and batch modes of
processing. However, before describing how each mode is invoked, it is
important to distinguish SORITEC interactive and batch processing modes from the
foreground and background processing modes that are typically associated
with these terms. When SORITEC is in interactive mode, the program takes
each line of input and processes it as it is received. In batch processing
mode, on the other hand, SORITEC accepts input lines until they are
logically concluded with an END statement. At that point, batch job execu-
tion begins. Note that SORITEC interactive and batch modes can run in both
foreground and background processing environments.
Batch job processing in SORITEC has certain characteristics that
sometimes make it more convenient to use than interactive mode. First, it
compiles a complete listing of the commands of a job and outputs it without
line prompts to the output device before execution begins. This separates
the command lines from the output and generally makes the output more
presentable for reports, etc. Second, batch processing mode provides for
the labelling of the job and the insertion of titles into the output
listing. Batch processing mode is often useful when output is too wide to
be displayed legibly on the terminal. Through DOS redirection and respeci-
fication of the output width, output that would otherwise be difficult to
read on a terminal can be routed to other output devices, such as line
printers. Although most of these features can be replicated in interactive
mode, they are generally more convenient to use in a batch job.
1.4.1 Interactive Processing
SORITEC Sampler prompts you for input after the banner page has been
passed. Prompts in SORITEC are of the form 1--, 2--, 3-- and so on.
When the first prompt is returned, interactive processing is started by
entering:
HELLO
Sampler will respond by printing another banner with version information,
date and time, default settings for input (SCAN) and output (WIDTH), and
workspace size. After the second prompt has been displayed, you may enter
any legal SORITEC Sampler command.
Interactive processing is terminated by entering the command:
QUIT
Execution of QUIT closes and returns any files that are currently attached
and returns control to DOS. All items in the user's workspace are
irretrievably lost once the QUIT command is executed.
1.4.2 Batch Processing
SORITEC identifies batch job processing through the JOB command. The
JOB command consists of the command name and up to 120 characters of label
information, i.e.,
JOB job_label
The JOB command supplies unchangeable labelling information for the entire
batch run. As such, only one JOB command may appear in any single job
deck. The "job_label" may not contain the symbols ; , $ , or &.
Batch processing is terminated by the END command which is entered
simply as:
END
At the end of a JOB, the END statement instructs SORITEC to return and
close any databank or other file which is attached. The user's workspace
is irretrievably lost after the END statement is processed.
Note that the END command has several uses in SORITEC. It is required
at the end of SORITEC SAL files and to close DO loops and PROCEDURES. This
does not mean that you cannot embed a command within a batch job that uses
an END statement. SORITEC keeps track of END statements when compiling
batch job statements and senses the end of a JOB only when it is logically
compelled to do so. Descriptions of these other commands that use the END
command are provided later in this documentation.
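The bookkeeping described above can be pictured with a short Python sketch. It is not SORITEC code and does not claim to show how SORITEC is implemented; it only shows one way of counting nested END statements. The function name job_end_index, the opener names DO and PROCEDURE, and the placeholder job lines are assumptions made for the illustration. Stacked commands separated by ";" are not handled.

```python
# Commands assumed (for this sketch only) to open a block that END closes.
BLOCK_OPENERS = {"DO", "PROCEDURE"}

def job_end_index(lines):
    """Return the index of the END line that closes the job, or None."""
    depth = 0  # number of blocks currently open inside the job
    for i, line in enumerate(lines):
        words = line.split()
        if not words:
            continue
        command = words[0].upper()
        if command in BLOCK_OPENERS:
            depth += 1
        elif command == "END":
            if depth == 0:
                return i  # no block is open, so this END closes the job
            depth -= 1    # this END closes the innermost open block
    return None

# Placeholder job deck; the DO line stands in for a real loop statement.
job = ["JOB demo run", "DO <loop arguments elided>", "PRINT GNP", "END", "PRINT GNP", "END"]
closing_line = job_end_index(job)
```

In this sketch the first END is matched to the DO line, so only the last END is reported as the end of the job.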
1.5 Executing SAC Files
SORITEC accepts input from sources other than the terminal through a command
file known as a SORITEC Alternate Command, or SAC, file. A SAC file is
simply a DOS file that contains legal SORITEC commands. It may be struc-
tured as a batch job for SORITEC's batch processing facility or may simply
be a set of commands as you would enter them from the terminal. For
SORITEC to recognize it as a SAC file, the filename must have a .SAC
extension, i.e., the file must exist on your DOS directory as
"filename.SAC". It can be constructed using any commercially available
text processor.
SORITEC will execute command files at any point in an interactive
processing session. Command file processing is started by entering:
EXECUTE filename
where "filename" is the name of the command file you wish to have executed.
Do not enter the file extension with the filename on the EXECUTE command
line.
If the SAC file exists on a drive or directory other than the current
one, it must be referenced within single quotations, i.e.,
EXECUTE 'd:filename'
or
EXECUTE '\path\filename'
Command file output is always displayed on the terminal unless it has been
redirected via DOS redirection.
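The naming rules for EXECUTE (no file extension, single quotes when a drive or path is given) can be summarized in a small Python sketch. This is an illustration only, not part of SORITEC; the helper name execute_command and the file names MODEL, B: and \RUNS are hypothetical.

```python
def execute_command(dos_path):
    """Build an EXECUTE line for a SAC file given as a DOS path."""
    # EXECUTE expects the filename without its .SAC extension.
    stem = dos_path[:-4] if dos_path.upper().endswith(".SAC") else dos_path
    # A drive letter or a directory path must be enclosed in single quotes.
    needs_quotes = ":" in stem or "\\" in stem
    return f"EXECUTE '{stem}'" if needs_quotes else f"EXECUTE {stem}"

same_directory = execute_command("MODEL.SAC")
other_drive = execute_command("B:MODEL.SAC")
other_path = execute_command("\\RUNS\\MODEL.SAC")
```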
1.6 SORITEC Input Journal Files
Sampler will open an input journal file on the current directory,
called SORITEC.JNL, if interactive processing mode is invoked and the ON
JOURNAL option is enabled. This file stores all commands that are en-
tered during a session so that you can archive the command sequence for
future use. The file can later be executed as a SORITEC Alternate Command
file. Journal files are particularly important for reviewing an interac-
tive session for errors when results are not as expected. They can also be
edited and re-executed to produce a "final draft" of a particular statisti-
cal or estimation problem.
Any file that exists as SORITEC.JNL on the current directory is auto-
matically erased when a new journal file is written. Be sure to rename any
journal files you wish to keep. Remember that you must change the filename
extension to ".SAC" if you wish to EXECUTE it as a command file.
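On a present-day system, the rename step could be scripted along the following lines. This Python sketch is only an illustration; the function name keep_journal and the new name SESSION1 are hypothetical, and the demonstration runs in a throwaway directory with a stand-in journal file.

```python
import tempfile
from pathlib import Path

def keep_journal(directory, new_stem):
    """Rename SORITEC.JNL in directory to new_stem.SAC and return the new path."""
    journal = Path(directory) / "SORITEC.JNL"
    target = journal.with_name(new_stem + ".SAC")
    if target.exists():
        # Refuse to overwrite an existing command file.
        raise FileExistsError(f"{target} already exists")
    journal.rename(target)
    return target

# Demonstration in a temporary directory with a stand-in journal file.
with tempfile.TemporaryDirectory() as workdir:
    (Path(workdir) / "SORITEC.JNL").write_text("USE 1980q1 1984q4\n")
    saved = keep_journal(workdir, "SESSION1")
    saved_name = saved.name
    journal_gone = not (Path(workdir) / "SORITEC.JNL").exists()
```

Renaming rather than copying means the next session starts without an old SORITEC.JNL to erase.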
Do not enable the JOURNAL option if you are running SORITEC Sampler
off floppy disks as there is no room on Disk 2 for writing the file. If
you attempt to do this, Sampler will print an error message and re-prompt
you for a command. Unpredictable results could be obtained, however, in
subsequent operations.
Chapter 2
SORITEC Syntax
2.0 Introduction
SORITEC syntax has been constructed to make the entire package easy to
learn and use. Typical SORITEC operations can be divided into two types of
statements: commands and transformations. Before considering the command
structure or allowable transforms, we need to consider the form of a
SORITEC variable name.
The most important fact to keep in mind when you are using SORITEC
Sampler is that the language is "series" oriented rather than value
oriented. In FORTRAN or BASIC, the statement X=Y sets the value of a
VARIABLE X to the value of variable Y. In SORITEC, X = Y replaces the
entire time-series X with the time-series Y. So, in SORITEC, single para-
meter values are more the exception than the rule.
2.1 Variable Names
SORITEC variable names are composed of the characters A-Z, the numbers
0-9 and the symbols @, %, ^, _, or :. The name MUST begin with a letter
and must be no more than 32 characters (or symbols) long. Mathematical
operators may not be used in variable names.
SORITEC allows dynamic leading or lagging of variables through sub-
scripted arguments; for example, GNP(1) and GNP(-1) are the first lead and
lag values of GNP. Arguments for lags or ranges may also be integer
constants or SORITEC variables. In commands that expect multiple argu-
ments, SORITEC will accept ranged values of leads and lags, e.g., GNP(+2 TO
-3) is automatically expanded to GNP(2) GNP(1) GNP GNP(-1) GNP(-2) GNP(-3).
Note that positive signing of lead arguments is optional.
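The range expansion can be illustrated with a few lines of Python. This is a sketch of the rule as described, not SORITEC code; the function name expand_lead_lag is hypothetical, and accepting ascending as well as descending ranges is an assumption.

```python
def expand_lead_lag(name, first, last):
    """Expand name(first TO last) into individual lead and lag references."""
    step = 1 if last >= first else -1
    references = []
    for offset in range(first, last + step, step):
        # A zero offset is the unshifted series, written without a subscript.
        references.append(name if offset == 0 else f"{name}({offset})")
    return references

expanded = " ".join(expand_lead_lag("GNP", 2, -3))
```

For GNP(+2 TO -3) the sketch yields the same six references listed above, from GNP(2) down to GNP(-3).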
2.2 Special Symbols
SORITEC defines several special symbols to provide a simplifying
shorthand in using the package. Currently, the symbols ;, !, ",", =, +, -,
*, /, ., >, <, (, ), ?, &, and the string ... have special meaning in
SORITEC.
; delimits each command when several commands are "stacked" on a
single line, e.g.,
USE 1984M1 1984M6 ; PRINT GNP
! identifies comments in SORITEC. If entered in column 1, any text
between the ! and the end of the line or line delimiter ";" is considered
to be a comment and is ignored by the interpreter. The ! symbol only
functions as a comment identifier if placed in column 1.
"," is used only as an argument separator and is interpreted as a
blank everywhere except in a format statement.
+ - * / . < > and = are reserved for math operations.
The parenthetical symbols, ( and ) , are reserved for designating
command modifiers, arguments and subscripts.
The symbols "*" and "?" are used as wildcard references, described
later in this chapter.
Lastly, & and ... indicate that the current command continues onto the
next line.
2.3 Variable Types
There are seven types of variables in the SORITEC language. Time-
series variables are the default data type in SORITEC, so a reference to
variable X implicitly references the time-series X. Variable assignment
implicitly assumes the variable is time-series unless you state otherwise;
so, a simple statement such as x=2 creates a series of numbers all equal to
2, NOT a single value.
The second most common data types are parameters and constants. Both
are scalar values, but whereas parameter values can be changed by the SET
command, constants cannot. Parameters and constants are created by the
following statements.
PARAMETER param_1 [value_1] param_2 [value_2] ...
CONSTANT const_1 [value_1] const_2 [value_2] ...
If the value associated with a parameter or constant is omitted, SORITEC
sets it to zero. Parameters can be set or reset using any standard
transformation by prefixing the transformation with the SET command, for
example:
PARAMETER a .5 b .3
SET a = a**0.5 * log(b)
SORITEC also defines vector and matrix data types. These types are
created by using the VECTOR or MMAKE commands, respectively. To create a
vector, use the command:
VECTOR vector_name value_1 value_2 ...
For example, to create the vector BETA, you would type
VECTOR BETA .5 .2 .1 -.5
SORITEC keeps track of the length of the vector when it is created.
Individual elements of a vector can be manipulated like scalar values in
SORITEC commands using subscript notation. For example:
SET ZERO=BETA(1)+BETA(4)
would set the value of the scalar ZERO to 0.0. Matrix data types are
not supported in SORITEC Sampler.
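The BETA example implies that vector subscripts start at 1. The following Python sketch (an illustration, not SORITEC code; the helper name element is hypothetical) mimics that convention on top of Python's zero-based lists.

```python
beta = [0.5, 0.2, 0.1, -0.5]  # values from VECTOR BETA .5 .2 .1 -.5

def element(vector, index):
    """Return the element at a 1-based subscript, as in BETA(1)."""
    if not 1 <= index <= len(vector):
        raise IndexError(f"subscript {index} is outside 1..{len(vector)}")
    return vector[index - 1]

# Counterpart of SET ZERO=BETA(1)+BETA(4)
zero = element(beta, 1) + element(beta, 4)
```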
SORITEC also allows you to name and manipulate equations as a separate
data construct using the EQUATION command. The form of the command is:
EQUATION equation_name [equation]
Equations are structured exactly as they are in FORTRAN.
Equations can be stored in databanks and can be computed by name, once
values have been assigned to their parameters and variables. Use the
COMPUTE command, which is of the form:
COMPUTE equation_name
to recompute values for the left-hand side variables.
Note that the primary use of equations in SORITEC is for forecasting
and for non-linear estimation. In SORITEC Sampler, you can only use equa-
tions for forecasting or recomputing values, but not for estimation.
The final data type is the GROUP. A GROUP is a namelist that speci-
fies a set of names for further processing. The namelist is initialized by
the GROUP command, which has the form:
GROUP group_name name_1 name_2 ... name_k
To extract the elements of a namelist, group expansion must first be
enabled via the ON GROUP command. The group name is then replaced by the
individual names in the namelist. This avoids the need to type the same
set of names repeatedly. For example, the following commands greatly
simplify testing the inclusion of variables in a regression equation.
GROUP basic_variables GNP M1 TAXES GOV_EXP PRIME
ON GROUP
REGRESS DEFICIT basic_variables PARTY
REGRESS DEFICIT basic_variables TIME
... etc
You can also reference individual elements within a GROUP by index
number. For example, you could reference "basic_variables(2)" in place of
M1 in the example given above. Referencing individual namelist elements by
index number is particularly useful in DO loops.
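Group expansion amounts to a text substitution on the argument list. The Python sketch below is an illustration under stated assumptions, not SORITEC's own mechanism: the function name expand_groups is hypothetical, group names are matched without regard to case, and index numbers are taken to start at 1.

```python
def expand_groups(tokens, groups):
    """Replace group names, and indexed references such as name(2), by members."""
    expanded = []
    for token in tokens:
        name, paren, rest = token.partition("(")
        members = groups.get(name.lower())
        if members is None:
            expanded.append(token)    # not a group: pass through unchanged
        elif not paren:
            expanded.extend(members)  # bare group name: insert the whole namelist
        elif (rest.endswith(")") and rest[:-1].isdigit()
              and 1 <= int(rest[:-1]) <= len(members)):
            expanded.append(members[int(rest[:-1]) - 1])  # 1-based element
        else:
            raise ValueError(f"bad group reference: {token}")
    return expanded

groups = {"basic_variables": ["GNP", "M1", "TAXES", "GOV_EXP", "PRIME"]}
command = expand_groups(["REGRESS", "DEFICIT", "basic_variables", "PARTY"], groups)
second = expand_groups(["basic_variables(2)"], groups)
```

Here the first call produces the full REGRESS argument list, and the second resolves basic_variables(2) to M1.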
2.4 Selection of the Observation Set
Periodicity and length of data series are defined by the USE command
in SORITEC. The USE period defined by this command is active in all
subsequent SORITEC commands until explicitly changed by another USE. Data
need not be continuous over the range of observations, but instead may
consist of a series of intervals. The form of the USE command is:
USE [begin_1] [end_1] [begin_2] [end_2] ...
USE requires zero, one or an even number of arguments which may be positive
integers, constants, parameters or a vector. Each pair of arguments de-
fines a range of observations within the overall observation range,
"begin_1" to "end_n". The second argument must not be less than the first.
If no arguments are included in the command line, SORITEC returns the
currently active USE period. If only one argument is included in the
command line, the end period is implicitly equated to the first.
SORITEC allows you to define annual, semi-annual, quarterly, monthly,
ten day, weekly, daily and undated data types. Periodicity of time-series
data is defined by appending an appropriate suffix to the data year, as
shown in the following table.
PERIODICITY    SUFFIX    RANGE(x)
-----------    ------    --------
Annual         none      --
Semi-annual    Sx        [1,2]
Quarterly      Qx        [1,4]
Monthly        Mx        [1,12]
Ten Day        Tx        [1,37]
Weekly         Wx        [1,52]
Daily          Dx        [1,366]
Undated        none      [1,1000]
The permissible range of years in dated data types is 1901 to 2100.
Note that Ten Day data consists of first and second ten-day periods of the
month, and a remaining period of 8, 10 or 11 days. Weekly data span Sunday
through Saturday. SORITEC Sampler will convert data series from one type to
another, but certain restrictions apply. Data conversion is discussed in
Section 6.4.
The following are examples of USE commands:
USE 1980q1 1984q4; USE 1942m12 1955m6
Note that the command USE 1980 is equivalent to USE 1980 1980.
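The dated period labels can be checked mechanically against the suffix table and the permitted range of years. The Python sketch below is an illustration only; the function name parse_period is hypothetical, the suffix limits are copied from the table as printed, and undated observation numbers are not covered.

```python
import re

# Suffix letter -> largest sub-period number, as given in the periodicity table.
MAX_SUBPERIOD = {"S": 2, "Q": 4, "M": 12, "T": 37, "W": 52, "D": 366}

def parse_period(label):
    """Split a dated label such as 1980Q1 into (year, suffix, sub-period)."""
    match = re.fullmatch(r"(\d{4})(?:([SQMTWD])(\d{1,3}))?", label.upper())
    if match is None:
        raise ValueError(f"not a dated period label: {label!r}")
    year = int(match.group(1))
    if not 1901 <= year <= 2100:
        raise ValueError(f"year {year} is outside 1901..2100")
    suffix = match.group(2)
    if suffix is None:
        return (year, None, None)  # annual data carries no suffix
    sub = int(match.group(3))
    if not 1 <= sub <= MAX_SUBPERIOD[suffix]:
        raise ValueError(f"{suffix}{sub} is outside 1..{MAX_SUBPERIOD[suffix]}")
    return (year, suffix, sub)

quarterly = parse_period("1980q1")
monthly = parse_period("1942m12")
annual = parse_period("1980")
```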
2.4.1 Conditional Selection of the Observation Period
SORITEC also permits conditional selection of the sample period based
on a logical variable. The format of the command is
USEIF series
where "series" is an indicator series. The USEIF command resets the USE
period to select only the observations corresponding to non-zero entries of
"series". For example, to run a regression on all individuals with income
between $12,000 and $24,500
NEW_SAMPLE = INCOME > 12000 .AND. INCOME < 24500
USEIF NEW_SAMPLE
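The effect of the two statements can be mimicked in Python as follows. This is a sketch of the selection logic only, not SORITEC code; the income figures are made up for the illustration.

```python
# Hypothetical income data for six individuals (observations 1 to 6).
income = [9000, 15000, 24500, 18000, 30000, 12000]

# Indicator series: 1 where 12000 < income < 24500, otherwise 0.
new_sample = [1 if 12000 < value < 24500 else 0 for value in income]

# USEIF keeps the observations whose indicator entry is non-zero.
selected = [obs for obs, flag in enumerate(new_sample, start=1) if flag != 0]
```

Because both comparisons are strict, the individuals with incomes of exactly 12000 and 24500 are left out of the sample.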
2.5 Transformations
The COMPUTE command is the basic SORITEC transformation command. The
command line consists of the COMPUTE command name followed by one argument,
which must be an EQUATION name or any legal SORITEC transformation expres-
sion, i.e.,
COMPUTE equation_name
or
[COMPUTE] transformation_expression
In the latter case, the COMPUTE command name may be omitted, e.g.,
result = var_1 + var_2
Transformations are straightforward in SORITEC as syntax considerations
conform to standard algebraic notation. Legal operators in SORITEC
transformations are as follows:
ARITHMETIC OPERATORS        LOGICAL OPERATORS
--------------------        -----------------
+   add                     .eq.                 equal
-   subtract                .ne. or <> or ><     not-equal
*   multiply                .ge. or >= or =>     greater-or-equal
/   divide                  .le. or <= or =<     less-than-or-equal
**  exponentiation          .gt. or >            greater-than
                            .lt. or <            less-than
                            .not.                negation
                            .and.                and-function
                            .or.                 or-function
Transformations can contain any of the mathematical functions listed
below.
LOG                                SINH     Hyperbolic Sine
 or ALOG   Natural Logarithm       COSH     Hyperbolic Cosine
 or LN                             TANH     Hyperbolic Tangent
                                   ASINH    Hyperbolic Arcsine
ALOG10                             ACOSH    Hyperbolic Arccosine
 or L10    Logarithm Base 10       ATANH    Hyperbolic Arctangent
EXP        Exponential constant    CEILING  Next Largest Integer
ABS        Absolute Value          FLOOR    Next Smallest Integer
                                   ROUND    Round to Nearest Integer
SIN        Sine
COS        Cosine                  SIGN     Extract Sign (+1,0,-1)
TAN        Tangent                 TRUNC    Truncate Fractional Part
ASIN       Arcsine
ACOS       Arccosine
ATAN       Arctangent
Arguments associated with these functions must be enclosed in paren-
theses. Note that there is no SQRT function in SORITEC. Use the more
general form var**0.5 instead.
Use of operators in SORITEC transformations must conform to the
following conventions.
(1) Two operators (+,-,.and.,.or., etc.) cannot occur in
sequence unless separated by one or more open
parentheses.
(2) The number of open and closed parentheses must be
equal.
(3) The mathematical operators "*", "/" and "**" cannot
occur immediately after an open parenthesis.
(4) An operator cannot occur immediately before a closed
parenthesis.
Transformations are parsed according to standard programming conven-
tions. Therefore, subformulae in parentheses are evaluated first, followed
by all function evaluations, then all "**" operations, then all "*" and "/"
operations, and lastly all "+" and "-" operations. Logical operators are
evaluated after parentheses and mathematical operators. Within this group,
mathematical comparisons (.eq., .ne. or <> or ><, .ge. or >= or =>, .le.
or <= or =<, .gt. or >, .lt. or < ) are evaluated first, followed by
logical negation (.not.), and lastly by .and. and .or.. When in doubt about
the order of evaluation, use parentheses to avoid errors.
Note that you can combine mathematical and logical operations in a
single transformation. This allows complex conditional structures to be
embedded directly into equations and expressions in a highly flexible
manner. For example, the expression y=log(x)*(b.gt.1)+x*(b.le.1) is a
legal SORITEC transformation. Each logical portion of the expression is
simply evaluated to 1 or 0 and then used in the computation.
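The way a comparison collapses to 1 or 0 can be sketched outside SORITEC.
The Python fragment below is an illustration only, not SORITEC code; the
function name conditional_y is hypothetical. It mirrors the expression
y=log(x)*(b.gt.1)+x*(b.le.1) for a single value of x and b.

```python
import math

def conditional_y(x, b):
    """Mimic y = log(x)*(b.gt.1) + x*(b.le.1) for one positive x."""
    # Each comparison is turned into an explicit 1 or 0 before it is
    # used as a multiplier, as the text describes.
    use_log = int(b > 1)
    use_raw = int(b <= 1)
    # Both terms are always computed in this Python model, so x must be
    # positive even when the log term is multiplied by 0.
    return math.log(x) * use_log + x * use_raw

print(conditional_y(math.e, 2))   # b > 1 selects the log term
print(conditional_y(5.0, 1))      # b <= 1 selects x itself
```

Exactly one of the two multipliers is 1 for any b, so the expression acts
as an if/else written in arithmetic.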
SORITEC Sampler does not handle some illegal transformations
gracefully and, in these situations, can terminate sessions abruptly by
exiting to DOS. For example, the transformation:
A = (=)
"crashes" the system and returns to the DOS command level. As all active
items in SORITEC's workspace are irretrievably lost in this situation, you
should avoid entering nonsense into SORITEC commands. Most common errors
in transformations, such as unbalanced parentheses, however, cause warning
statements to be issued but keep the current SORITEC session active.
2.6 Revising Data in SORITEC
Data series may be extended or revised easily in SORITEC using the
REVISE command and the USE command. The format of the REVISE command is
similar to the COMPUTE command, i.e.,
REVISE transformation_expression
A data item being REVISEd must have been previously defined in SORITEC.
The command cannot be used to initialize the variable.
REVISE updates a variable by temporarily deactivating values for the
variable that lie outside the range of the currently active USE period. In
other words, to update a data series you must first define the observations
of the series that you wish to revise with the USE command before changing
the data with the REVISE command. For example, revision of the third
observation of an undated series "old_data", defined below, requires the
following commands to generate the series on the right:
                                             OLD_DATA
                                         ................
   FILL old_data 1 2 3 4 5                       .
   USE 3 3                                  1    .    1.00000
   REVISE old_data=3.5                      2    .    2.00000
   PRINT old_data                           3    .    3.50000
                                            4    .    4.00000
                                            5    .    5.00000
Since any legal transformation is permitted as an argument, the right
hand side of the equation can be a constant, time-series or other valid
SORITEC expression. Revision of the third and fourth observations of the
original "old_data", for example, requires the following commands to pro-
duce the output on the right:
                                             OLD_DATA
   USE 3 4                               ................
   FILL new_data 4 5                             .
   REVISE old_data = new_data - 1.5         1    .    1.00000
   USE 1 5                                  2    .    2.00000
   PRINT old_data                           3    .    2.50000
                                            4    .    3.50000
                                            5    .    5.00000
Extending a data series by one or more observations simply requires
redefining the USE period to the period you wish to update and revising the
data as before. For example, the output on the right is produced by the
following commands:
                                             OLD_DATA
   USE 6 6                               ................
   REVISE old_data = 6                           .
   USE 1 6                                  1    .    1.00000
   PRINT old_data                           2    .    2.00000
                                            3    .    3.00000
                                            4    .    4.00000
                                            5    .    5.00000
                                            6    .    6.00000
A similar procedure is used when splicing two series together. For
example, the command sequence on the left splices observations 6 through 10
of "new_data" to the original five observations of "old_data".
                                             OLD_DATA
   USE 6 10                              ................
   FILL new_data 6 7 8 9 10                      .
   REVISE old_data = new_data               1    .    1.00000
   USE 1 10                                 2    .    2.00000
   PRINT old_data                           3    .    3.00000
                                            4    .    4.00000
                                            5    .    5.00000
                                            6    .    6.00000
                                            7    .    7.00000
                                            8    .    8.00000
                                            9    .    9.00000
                                           10    .    10.0000
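The interaction of USE and REVISE in the examples above can be modelled in
a few lines of Python. This is a sketch for illustration, not SORITEC
code: the function name revise and its arguments are hypothetical, the USE
period is passed as a pair of 1-based observation numbers, and any
observation left between the old end of the series and the new window is
stored as None (an assumption; the text does not cover that case).

```python
def revise(series, first, last, values):
    """Return a copy of `series` in which observations first..last
    (1-based, inclusive) are replaced by `values`. Observations outside
    that window are kept unchanged, and the series grows if the window
    reaches past its current end."""
    if len(values) != last - first + 1:
        raise ValueError("need one value per observation in the window")
    out = list(series)
    # Grow the copy so the window fits; a non-positive count adds nothing.
    out.extend([None] * (last - len(out)))
    out[first - 1:last] = values
    return out

old_data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(revise(old_data, 3, 3, [3.5]))                        # revise one value
print(revise(old_data, 3, 4, [2.5, 3.5]))                   # revise two values
print(revise(old_data, 6, 10, [6.0, 7.0, 8.0, 9.0, 10.0]))  # splice
```

The three calls correspond to the single revision, the two-observation
revision and the splice shown in this section.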
Data revision can also be automatically implemented through the
COMPUTE and FILL commands by enabling the ON REVISE global option. (The
FILL command is described in Chapter 3.) Values for data in the currently
active USE period are overwritten when these commands are executed, but
values outside the USE period are retained, until an OFF REVISE command is
encountered.
2.7 Missing Data Handling
In general, SORITEC does not do casewise or any other type of dele-
tion when it encounters MISSING data. Instead, an error message is
printed and zero is used in all contexts except transformations. An excep-
tion to this rule occurs in the cross-sectional procedures. Here, categori-
cal techniques treat missing data as a separate category while SYNOPSIS,
non-parametric and other statistical techniques ignore missing values.
Several enhancements to missing value handling have been added to
SORITEC.
(1) SORITEC generates a MISSING value in transforma-
tions that involve MISSING data, except when
MISSING data are multiplied by zero. Here, a zero
value for the transformation results.
(2) The PUNCH command now generates the word 'MISSING'
for each missing value.
(3) The READ command now recognizes the words 'MISSING'
and 'NA' in input data.
(4) A MISSING command has been added that allows you to
assign a missing value to a SORITEC constant.
(5) A LEGAL function has been added that scans a data
item for missing values.
The operation of the MISSING command and of the LEGAL function is
described below.
2.7.1 Missing Value Symbol Declaration
SORITEC constants can be assigned missing values with the MISSING
command. The syntax of the command is:
MISSING constant_name
The argument "constant_name" is defined to be a SORITEC constant with the
value MISSING assigned. Only one argument is permitted in the command
line. Regardless of its prior type, the argument is always redefined as
a SORITEC constant. MISSING cannot assign a missing value to any other
variable type. You can, however, assign missing values to other variable
types using the REVISE command, as the following example shows.
   The commands...                       yield

   USE 1 3                                    SERIES
   FILL SERIES 1 2 3                       .............
   USE 3                                 1    .    1.00000
   MISSING X                             2    .    2.00000
   REVISE SERIES=X                       3    .    MISSING
   USE 1 3
   PRINT SERIES
2.7.2 Missing Value Logical Function
The LEGAL function returns the value 1 if a data item is not
MISSING and zero otherwise. This enables easy conversion of MISSING
values to another value.
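One way such a conversion could work is sketched below in Python. This is
an illustration only, not SORITEC code. It assumes the rule given in
Section 2.7 that MISSING multiplied by zero yields zero; the names legal,
times and replace_missing are hypothetical, and the idiom
x*LEGAL(x) + fill*(1-LEGAL(x)) is an inference from those two rules, not a
form quoted from the manual.

```python
MISSING = None  # stand-in for SORITEC's MISSING value

def legal(x):
    """Return 1 if x is not MISSING and 0 otherwise."""
    return 0 if x is MISSING else 1

def times(a, b):
    """Multiply under the documented rule: MISSING times zero is zero,
    and any other product involving MISSING is MISSING."""
    if a is MISSING or b is MISSING:
        other = b if a is MISSING else a
        return 0 if other == 0 else MISSING
    return a * b

def replace_missing(x, fill):
    """Evaluate x*LEGAL(x) + fill*(1 - LEGAL(x))."""
    return times(x, legal(x)) + fill * (1 - legal(x))

print(replace_missing(4.0, -99.0))      # a legal value passes through
print(replace_missing(MISSING, -99.0))  # MISSING is replaced by the fill
```

When x is MISSING, the first term becomes MISSING times zero, which is
zero, so only the fill value survives.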
2.7.3 Imputation of Missing Values
SORITEC Sampler provides four options for replacing missing values.
Missing values may be substituted by zero, the series mean, the interpo-
lated value or the trend forecast. The option is set globally by the
IMPUTE command, i.e.,
IMPUTE [ZERO|MEAN|INTER|TREND|NONE]
Normal missing value processing is resumed when the NONE option is
executed. Entering the command IMPUTE with no arguments returns the option
currently in effect. The details of each option are as follows:
   ZERO    substitutes 0 for each missing observation
   MEAN    replaces each missing observation with the mean of the
           series during the current use period
   INTER   interpolates the range between the last two known
           non-MISSING values over the missing observations
   TREND   fills in missing values with the simple trend
           forecast for the series over the current use period
   NONE    stops implicit imputation of missing values
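The first three options can be sketched in Python as follows. This is an
illustration under stated assumptions, not SORITEC's implementation: the
function name impute is hypothetical, missing values are represented by
None, MEAN is taken over the non-missing values only, and INTER is read as
straight-line interpolation between the known values on either side of a
gap. TREND is left out because the trend forecast is not defined in this
section.

```python
def impute(series, option):
    """Replace None entries according to ZERO, MEAN or INTER."""
    known = [(i, v) for i, v in enumerate(series) if v is not None]
    if option == "ZERO":
        return [0.0 if v is None else v for v in series]
    if option == "MEAN":
        # Assumes at least one non-missing value exists.
        mean = sum(v for _, v in known) / len(known)
        return [mean if v is None else v for v in series]
    if option == "INTER":
        out = list(series)
        # Fill each gap linearly between its two known neighbours.
        # Leading or trailing gaps have only one neighbour and stay None.
        for (i0, v0), (i1, v1) in zip(known, known[1:]):
            for i in range(i0 + 1, i1):
                out[i] = v0 + (v1 - v0) * (i - i0) / (i1 - i0)
        return out
    raise ValueError("unsupported option: " + option)

gappy = [1.0, None, 3.0, None, None, 6.0]
print(impute(gappy, "ZERO"))
print(impute(gappy, "INTER"))
```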
2.8 Wildcards
SORITEC now supports the '*' and '?' symbols as wildcard characters in
arguments. The wildcarding scheme is a simple way to reduce the time
spent typing and viewing output (e.g. from the SYMBOLS command,
described later in this chapter). Currently, wildcards are available for
use with the FORGET, GROUP, and SYMBOLS commands.
The rules for wildcard construction are simple. An asterisk repre-
sents zero or more alphanumeric characters and a question mark substi-
tutes for any single character. Commands which permit wildcards match all
the names in the local workspace against the wildcard pattern and expand
the command line appropriately.
The following examples explain wildcard processing in SORITEC. Assume
that the local workspace contains the variables X, XY, XXY, BBYB, BB,
ABXYZ, and ABXY. Then:
THESE WILDCARDS: WOULD REFERENCE THESE ITEMS:
* X, XY, XXY, BBYB, BB, ABXYZ, ABXY
? X
B* BBYB, BB
*B* BBYB, BB, ABXYZ, ABXY
?B?? BBYB, ABXY
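The table above can be reproduced with Python's standard fnmatch module,
whose '*' and '?' follow the same two rules. This is only a sketch for
checking patterns by hand, not part of SORITEC; the helper name expand is
hypothetical. Note that fnmatchcase is case-sensitive and also gives
square brackets a special meaning, neither of which is claimed here for
SORITEC.

```python
from fnmatch import fnmatchcase

WORKSPACE = ["X", "XY", "XXY", "BBYB", "BB", "ABXYZ", "ABXY"]

def expand(pattern, names=WORKSPACE):
    """Return the names that match the wildcard pattern, in workspace order."""
    return [name for name in names if fnmatchcase(name, pattern)]

for pattern in ["*", "?", "B*", "*B*", "?B??"]:
    print(pattern, expand(pattern))
```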
2.9 Options
Several global options are available to control the amount of
printing, depth of analysis, etc. These options are enabled and disabled
by the ON and OFF commands. For example, the command ON PLOT will cause
residual plots to be produced when an equation is estimated. A complete
list of available options with current settings will be displayed by
SORITEC if an ON command is entered with no arguments. Global options in
SORITEC Sampler with their default settings are described in Appendix II.
After every ON or OFF command which changes an option, an inter-
nal result called ^FLAGS is stored as a vector. ^FLAGS contains
information on the global options which are in effect immediately after the
ON or OFF command is executed. It can be RECOVERed, retained in SORITEC's
workspace or stored in SORITEC databanks, and can later be used to restore
global options to settings that were in effect when they were recovered.
Global options are restored with the FLAGS command, which has the
format:
FLAGS flag_vector
The argument "flag_vector" is the name of the vector to which the RECOVERed
SORITEC internal variable ^FLAGS has been written.
Note that flag vectors must not be changed in any way, or
unpredictable results may occur. The FLAGS command exists solely to
restore previous global option settings. Furthermore, the ordering and
number of the global options is subject to change in future releases so
flag vectors stored on SORITEC databanks may not restore the options
desired if retrieved by a later release of SORITEC.
2.10 Recovering Internal SORITEC Variables
The RECOVER command allows the user to access and manipulate secondary
results which have been generated and stored under internal names by
SORITEC commands. Either one or two arguments are associated with the
command, which has the syntax:
RECOVER [name] internal_name
The "internal_name" is an internal system name which identifies which
secondary result to RECOVER from SORITEC for later use. Legal system names
of secondary results that can be recovered are given in Appendix I. The
first argument, "name", is optional and is a user-defined name assigned to
the recovered item. If omitted, the recovered name is identical to the
internal system name.
In addition to the RECOVER command, SORITEC allows you to directly
reference internal system names by prefixing a caret (^) to the
variable name. For example, the commands:
RECOVER fitted_values yfit
and
fitted_values = ^yfit
would both recover the fitted values of the dependent variable and copy
them into the variable named "fitted_values". SORITEC internal system
names can be referenced directly in most situations. For example, parame-
ters and time-series variables that are internal system names can be reas-
signed with the SET command and can be referenced in transformation opera-
tions. Equations, matrices, vectors and GROUPS can also be referenced.
However, reassignment still requires the use of the RECOVER command.
Internal names cannot be saved to a databank without being reassigned to
another variable.
SORITEC will not confuse its own internal system names with variables
or other identically-named data items that the user has defined in his/her
program. The type of the first argument (variable, vector, constant, or
other SORITEC form of data organization) is automatically defined or rede-
fined to the type required by the second argument.
Secondary results need not be recovered immediately. All such results
remain available until a command is executed which stores other results
under the same internal system name. In that event, the prior results held
under that internal system name are lost.
Note that some intermediate results are retained under internal system
names only if the appropriate global options are set with the ON or OFF
command. Check the default switch settings associated with each command to
ensure that intermediate results are automatically saved. The internal
variables and the global options that enable them are:
   Internal                                       Flag Setting
   Name        Description                        to Save Value
   ^CCOR       Coefficient Correlation Matrix     OFF NOMATS
   ^VCOV       Coefficient Covariance Matrix      OFF NOMATS
   ^XTABLE     Crosstabulation Table              OFF NOMATS
   ^RAWEQ      Raw Forecasting Equation           ON RAWEQ
See Appendix II for the default values for these options.
2.11 SORITEC's Symbol Table
Any time during an interactive or batch session you can determine what
item names are currently active in SORITEC's workspace by examining the
symbol table. SORITEC's symbol table is listed on the output device when
the command:
SYMBOLS [ALL]
is entered. The symbol table lists each item's name, storage address, item
type and length. Including the optional keyword "ALL" in the command line
prints all currently active SORITEC internal names in addition to user-
defined items. SYMBOLS accepts wildcards so that a selective search of the
symbol table can be made.
Items can be removed from SORITEC's symbol table by invoking the
FORGET command which is of the form:
FORGET [item_1] [item_2] ... [item_n]
Each "item_i" is a currently active item in SORITEC's workspace, as identi-
fied from the symbol table. The command may have up to 100 arguments.
FORGET accepts wildcards so that selected items from the symbol table can
be removed. For example,
FORGET ab*
removes all items that begin with the characters "ab" from the symbol
table. All items from the symbol table are removed by entering the wil-
dcard symbol "*" in place of item names, i.e.,
FORGET *
FORGET does not affect the contents of attached SORITEC databanks nor does
it return databanks.
Note that FORGET erases item names from the SORITEC symbol table but
does NOT remove the data from the workspace. If you exceed the workspace
limitation, FORGETting items from the symbol table will not free the stack
space they occupied. You must QUIT the current session and re-invoke
SORITEC from DOS to free the workspace.
2.12 Minor Control Statements
Several commands alter default settings other than those identified
with global options (ON/OFF) or pass information to SORITEC for use in
output listings.
2.12.1 Specify Width of Output Device
The width of output from SORITEC Sampler can be adjusted using the
WIDTH command, i.e.,
WIDTH number
The argument, "number", must be a numeric value between 50 and 150. Argu-
ments outside this range will generate an error message, leaving the pre-
vious WIDTH definition intact. The default value for interactive usage is
80 characters; in batch mode, the default value is 132 characters.
2.12.2 Change Length of Input Line
The length of the input line that SORITEC Sampler can accept may be
changed by the SCAN command, which has the format:
SCAN number
The argument, "number", must be a numeric quantity between 50 and 150.
Arguments outside this range will cause an error message and the existing
SCAN setting will remain in effect. The default value for SCAN in
interactive and batch modes is 80 characters.
2.12.3 Reset Maximum Error Limit
The maximum error limit can be reset in SORITEC batch jobs to alter
the number of NONFATAL and SERIOUS errors a job can commit before the batch
processor abandons compilation and execution. The syntax of the command
is:
MAXERR number
where "number" is a numeric quantity that defines the new error limit. The
default setting for MAXERR is 25.
2.12.4 Turn Batch Listing On or Off
Listings of batch job commands are turned on or off by the ONLIST
and OFFLIST commands, respectively. The default setting is OFFLIST.
2.12.5 Label Batch Output Pages
Up to 120 characters of label information can be printed to a SORITEC
Sampler batch job listing using the TITLE command. The syntax of the
command is:
TITLE [label]
The text of "label" will appear on the third line of each output page,
following the JOB statement. A TITLE command with no argument causes the
third line of succeeding pages to be blank. Title labels may not contain
the symbols ; , $ , or &. As many TITLE commands as needed can be placed
in a job. They are executed as they are encountered in the job stream and
label all succeeding pages until another TITLE command is executed.
Chapter 3
Data Entry and Output
3.0 Introduction
Data may be imported into or exported from SORITEC Sampler in several
formats, including SORITEC Alternate Load (SAL) files, DIF files, FORTRAN
formatted files, SORITEC Database Files (SDB), and keyboard entry. In
addition, data may be displayed at the terminal either in tabular or
graphical format. This section describes the available data input and
output options with detailed descriptions of the syntax and examples.
The most common mistakes that users make with data entry are (a)
forgetting to move the file into the current working directory, (b) forget-
ting to add the correct file extension to the file when it is created, or
(c) using a file extension in SORITEC. In the latter case, SORITEC Sampler
always appends the appropriate file extension to the file name so that you
need not specify the extension in SORITEC Sampler file manipulation com-
mands. If you specify a SAL file as READ(MYFILE.SAL), SORITEC Sampler will
look for MYFILE.SAL.SAL. On the other hand, READ(MYFILE) will not execute
if you have forgotten to append a .SAL extension to the name of the stored
DOS file that you want to read.
3.1 SORITEC Alternate Load (SAL) Files
SAL files are the easiest way to import large amounts of data into
SORITEC Sampler. They are also a convenient means of exporting data,
particularly if you want to move data to SORITEC on another (non-DOS)
computer. SAL files are essentially free-field ASCII files with a special
header. If you already have data in a tabular format, you can quickly
create a SAL file by editing the table with any standard text editor or
word processor. SAL files are composed of three parts, (1) the header, (2)
the data, and (3) the data terminator.
The header conveys information necessary for SORITEC Sampler to
correctly read the data. Two commands are used to define the header
section. The first is the USE command, which tells SORITEC Sampler what
time period the data span. This is followed by the READ command, which
tells SORITEC Sampler what variable names to assign to the data. The data
for each READ are terminated by a ';'. The final line of each data section
in a SAL file is an END statement that tells SORITEC Sampler to expect no
more data for the READ statement being executed.
An example demonstrates the structure of a SORITEC SAL file. We wish
to import the following data into SORITEC Sampler:
YEAR GNP TAXES PRIME
1970 1423.5 455.6 10.75
1971 1564.2 678.3 9.76
1972 1688.9 778.4 13.45
The following file, named MACRO.SAL (SAL files must end with a .SAL file
extension), is a valid SORITEC SAL file.
USE 1970 1972
READ GNP TAXES
1423.5 455.6 1564.2 678.3
1688.9 778.4 ;
READ PRIME
10.75 9.76 13.45 ;
END
SAL files can contain any number of data series. Furthermore, data sec-
tions (sections of a SAL file delimited by an END statement) can be stacked
as necessary and imported using multiple reads (or exported using multiple
writes) in SORITEC Sampler. More than one variable can be input with a
single READ. The USE period can be changed as often as necessary to
conform to the data.
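The layout of MACRO.SAL can be made concrete with a small Python reader.
This is a sketch written for this example, not a description of how
SORITEC Sampler parses SAL files. The function names are hypothetical, and
the ordering of values (one value per variable for the first observation,
then one per variable for the second, and so on) is inferred from the
sample file above.

```python
SAL_TEXT = """\
USE 1970 1972
READ GNP TAXES
1423.5 455.6 1564.2 678.3
1688.9 778.4 ;
READ PRIME
10.75 9.76 13.45 ;
END
"""

def to_value(token):
    """Return a float, None for MISSING/NA, or raise ValueError for a name."""
    if token.upper() in ("MISSING", "NA"):
        return None
    return float(token)

def is_value(token):
    try:
        to_value(token)
        return True
    except ValueError:
        return False

def parse_sal_section(text):
    """Parse one data section (up to END) into (use_period, {name: values})."""
    tokens = text.replace(";", " ; ").split()
    use_period, series = None, {}
    i = 0
    while i < len(tokens) and tokens[i].upper() != "END":
        word = tokens[i].upper()
        if word == "USE":
            use_period = (tokens[i + 1], tokens[i + 2])
            i += 3
        elif word == "READ":
            i += 1
            names = []
            while not is_value(tokens[i]):   # names run up to the first value
                names.append(tokens[i])
                i += 1
            values = []
            while tokens[i] != ";":          # values run up to the terminator
                values.append(to_value(tokens[i]))
                i += 1
            i += 1                           # skip the ';'
            # De-interleave: every len(names)-th value belongs to one series.
            for k, name in enumerate(names):
                series[name] = values[k::len(names)]
        else:
            raise ValueError("unexpected token: " + tokens[i])
    return use_period, series

print(parse_sal_section(SAL_TEXT))
```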
3.1.1 SAL File Input
SAL files are imported into SORITEC Sampler using the READ command
which has the format:
READ(filename)
As the USE period and all variable names are already predefined in the SAL
file headers, no further information is needed. If referenced simply as
above, the SAL file, "filename" must exist in the current directory with
the filename "filename.sal". If the SAL file exists on a drive or
directory other than the current one, it must be referenced within single
quotations, i.e.
READ('d:filename')
or
READ('\path\filename')
A READ command imports data from a SAL file until it encounters an END
statement. A later READ of the same file would then begin importing data
following this delimiter until the next END statement is reached, and so
on. No section of a SAL file can be re-read, since the file is sequen-
tially organized.
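The sequential behavior described above can be pictured with the following
Python model. It is an illustration only; the class name SalFile and its
read method are hypothetical and are not SORITEC commands. Each call hands
out the next END-delimited section, and nothing can be handed out twice.

```python
class SalFile:
    """Hold the END-delimited sections of a SAL file, each readable once."""

    def __init__(self, text):
        self._sections = []
        current = []
        for line in text.splitlines():
            if line.strip().upper() == "END":
                self._sections.append("\n".join(current))
                current = []
            else:
                current.append(line)
        self._next = 0   # index of the first unread section

    def read(self):
        """Return the next unread section; earlier ones cannot be re-read."""
        if self._next >= len(self._sections):
            raise EOFError("no unread sections remain")
        section = self._sections[self._next]
        self._next += 1
        return section

stacked = SalFile("USE 1 2\nREAD A\n1 2 ;\nEND\nUSE 1 2\nREAD B\n3 4 ;\nEND\n")
print(stacked.read())   # first section (series A)
print(stacked.read())   # second section (series B)
```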
3.1.2 SAL File Output
Data may be exported from SORITEC Sampler in SAL file format using the
PUNCH command. The format of the PUNCH command is:
PUNCH series_1 series_2 ...
PUNCH creates a SAL file named PUNCH1.SAL in the current directory or
drive. SAL file data may not be directed to any other file name from within
Sampler.
Before writing data to a SAL file the desired USE period MUST be in
effect, and data series on the command line must be of the same periodici-
ty. SORITEC Sampler appends the extension .SAL to the file when it is
opened. If PUNCH1.SAL already exists in the directory, SORITEC Sampler
will over-write the existing file with the new one. Note that SAL files
remain open until closed by a QUIT command. Multiple PUNCH commands to the
same file will therefore append the data to the referenced SAL file. An
END delimiter is appended to the file when it is closed.
3.2 Data Interchange Format (DIF) Files
The Data Interchange Format (DIF) file format has emerged as a de-
facto standard for exchanging data between popular PC packages such
as LOTUS 1-2-3, DBASE II, SUPERCALC, and various stand-alone graphics
packages. Because of this, SORITEC Sampler has been equipped with DIF
file input and output facilities. The format of the DIF commands is
subject to change in future SORITEC releases.
3.2.1 DIF File Input
SORITEC Sampler imports DIF files through the READDIF command. There
are two forms of the READDIF command. If variable names are in the DIF
file, then the command is simply:
READDIF(filename)
If variable names are not in the DIF file, the command line is:
READDIF(filename) series_1 series_2 ...
SORITEC Sampler supports subdirectory addressing within the filename
reference. If the DIF file exists on a drive or directory other than the
current one, it must be referenced within single quotations, i.e.
READDIF('d:filename') [series_1 series_2 ...]
or
READDIF('\path\filename') [series_1 series_2 ...]
READDIF does not read dates in DIF files so an appropriate USE period
must be in effect before the command is executed.
At this writing, READDIF expects to find ONLY time-series data in
the input DIF file. Any spreadsheet cells that do not contain legal num-
bers are interpreted as 'MISSING' values by SORITEC Sampler. As a
consequence, SORITEC-generated DIF files that contain data other than
time-series and that are later read by SORITEC Sampler will NOT generally
produce useful results.
There are two ways that data can be organized in LOTUS to pass it
to SORITEC Sampler: with and without labels. In either case, the data are
interpreted under the currently active USE period in SORITEC Sampler. The
USE interval is never derived from a DIF file's contents.
If the columns are to be labeled, the names must appear in ROW 1 and
if the rows are to be labeled, the names must appear in COLUMN A. For
example, if the following worksheet is written to 'NATIONAL.DIF' using the
LOTUS translate function:
           A         B         C         D
      +------------------------------------
    1 |   GNP       TAXES     PRIME
    2 |   1423.5    455.6     10.75
    3 |   1564.2    678.3     9.76
    4 |   1688.9    778.4     13.45
then NATIONAL.DIF can be read into SORITEC Sampler using the commands:
USE 1970 1972
READDIF(NATIONAL)
READDIF can read variable names up to 32 characters in length.
The unlabelled method is less convenient because correct variable
names must be specified in the READDIF command. In the following example,
READDIF assumes that the desired variables are stored in column order.
If column D was not empty and the USE specified four observations, then the
data would be interpreted in row order. The following table written
from LOTUS to the file NATIONAL.DIF:
           A         B         C         D
      +------------------------------------
    1 |   1423.5    455.6     10.75
    2 |   1564.2    678.3     9.76
    3 |   1688.9    778.4     13.45
can be read into SORITEC Sampler with the commands:
USE 1970 1972
READDIF(NATIONAL) GNP TAXES PRIME
with the same results as in the labelled example.
Input data outside the current USE interval are ignored. If insuffi-
cient data exist to satisfy the current USE period, the remaining
observations are set to 'MISSING'. READDIF tries to do something
reasonable with any input DIF file by first considering the current
USE interval, then examining the DIF file contents. One should spot-
check READDIF input results to ensure that the rows and columns are inter-
preted as intended.
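The truncation and padding rules can be sketched in Python as follows.
This is an illustration only, not READDIF itself: the function names are
hypothetical, the worksheet is given as a list of rows, and only the
column-order case is modelled. The row-order interpretation mentioned
earlier is not attempted.

```python
MISSING = None  # stand-in for SORITEC's MISSING value

def fit_to_use(values, n_obs):
    """Drop values beyond n_obs and pad a short series with MISSING."""
    kept = list(values[:n_obs])
    return kept + [MISSING] * (n_obs - len(kept))

def read_columns(rows, names, n_obs):
    """Treat each worksheet column as one series of n_obs observations."""
    series = {}
    for k, name in enumerate(names):
        column = [row[k] for row in rows]
        series[name] = fit_to_use(column, n_obs)
    return series

rows = [
    [1423.5, 455.6, 10.75],
    [1564.2, 678.3, 9.76],
    [1688.9, 778.4, 13.45],
]
print(read_columns(rows, ["GNP", "TAXES", "PRIME"], 3))  # USE 1970 1972
print(read_columns(rows, ["GNP", "TAXES", "PRIME"], 4))  # one extra period
```

With four observations in the USE period, each series ends in a MISSING
value because the worksheet holds only three rows.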
3.2.2 DIF File Output
DIF files may be exported from SORITEC Sampler using the PUNCHDIF
command. This command has the format:
PUNCHDIF[(filename)] arg_1 arg_2 arg_3 ... arg_n
where the arguments may be time-series, parameters, constants, vectors
or matrices. Variable names in the argument list that are longer than 10
characters are truncated. SORITEC Sampler
creates a file called 'filename.DIF' which can be translated into a LOTUS
worksheet using LOTUS' translate utility. If the filename is omitted,
SORITEC Sampler creates a file named PUNCH1.DIF. You can redirect DIF file
output to a file on another drive or directory other than the current one
using the same conventions as the READDIF command.
Note that the following rules apply:
(1) Only observations active under the current USE com-
mand are written to the file.
(2) PUNCHDIF re-orders its arguments (if required) so
that all SERIES are written first, followed by
CONSTANT items, and lastly, VECTOR items.
(3) PARAMETERS are output as CONSTANTS.
(4) MATRICES are output as VECTORS with M * N elements.
(5) SORITEC 'MISSING' values are output as 'NA'.
Most of these considerations are demonstrated by the following example:
USE 1984Q1 1984Q3
FILL GNP 1423.5 1564.2 1688.9
FILL TAXES 455.6 678.3 778.4
FILL PRIME 10.75 9.76 13.45
SET CONST=35.
CONSTANT CONST2 223
PARAMETER C3
VECTOR VVV 1 2 3
VECTOR V2 4 3 2 1
USE 1984Q2 1984Q4
PUNCHDIF(ADIFFILE) V2 VVV C3 CONST2 CONST &
GNP TAXES PRIME
'ADIFFILE.DIF' is created and results in the following spread-
sheet after being read into LOTUS 1-2-3:
           A          B         C         D         E         F
      +--------------------------------------------------------
    1 |   TIME       GNP       TAXES     PRIME
    2 |   1984Q2     1564.2    678.3     9.76
    3 |   1984Q3     1688.9    778.4     13.45
    4 |   1984Q4     NA        NA        NA
    5 |   CONSTANT   C3        0
    6 |   CONSTANT   CONST2    223
    7 |   CONSTANT   CONST     35
    8 |   VECTOR     VVV       1         2         3
    9 |   VECTOR     V2        4         3         2         1
3.3 Formatted Input and Output
SORITEC Sampler supports formatted input and output of data and text.
The command syntax for formatted I/O is similar to FORTRAN formatted I/O.
In other words, the read or write statement refers to a FORMAT statement
number that contains the format for the input or output.
The FORMAT command has a statement number, the command name FORMAT and
a legal format specification, i.e.,
statement_number FORMAT format_specification
The statement_number is always a positive integer between 1 and 9999. It
must be unique within any given session or batch job. In other words, once
a FORMAT is entered and identified by a statement number, no other command
can have the same statement number during that session. Allowable
"format_specifications" are identical to those permitted in FORTRAN
programs. Consult any FORTRAN reference manual for details on FORMAT
statements.
3.3.1 FORTRAN Formatted Input
Although free-format SAL files are the preferred way to import data to
SORITEC Sampler, there may be occasions when data are structured so that it
is necessary to use an explicit format statement. Standard FORTRAN-style
format statements are used. Sampler can read formatted data directly from
the terminal or from a file. The syntax for reading formatted data is:
READ([filename] [statement_number]) series_1 series_2 ...
Here, the "statement_number" refers to a previously defined format state-
ment. The optional data file identified by "filename" must have a .SAL
file extension. If omitted, SORITEC Sampler reads the data from the
current input device, i.e. the terminal or a SAC file if a command file is
being executed. If the format statement number is omitted, data are
assumed to be free-formatted.
Input file redirection is supported by the READ statement so that you
can read a formatted file from a drive or directory other than the current
one if it is referenced within single quotations, i.e.,
READ('d:filename' statement_number) series_1 series_2 ...
or
READ('\path\filename' statement_number) &
series_1 series_2 ...
Unlike regular SAL files, formatted files cannot be read by multiple
READ statements; all data from the file must be imported at one time.
Normally, formatted READ commands expect data to be organized in columns.
However, if the STREAMIO option is enabled by the ON STREAMIO command, data
can be read by rows. For example, to read the text file MACRO1.SAL,
including the headers, given below:
KEY MACROECONOMIC INDICATORS
              1970    1971    1972
GNP         1423.5  1564.2  1688.9
TAXES        455.6   678.3   778.4
PRIME RATE   10.75    9.76   13.45;
the following command sequence would be required:
ON STREAMIO
USE 1970 1972
101 FORMAT(///10X,3F8.1)
READ(MACRO1 101) GNP
102 FORMAT(10X,3F8.2)
READ(MACRO1 102) TAXES
READ(MACRO1 102) PRIME
Although this is almost as straightforward as standard SAL file
input, a FORMAT statement must be defined and its statement number
referenced in the READ statement. Also unlike standard SAL file reads, you
must explicitly reference the variable list in the READ statement and the
USE period must be set in the main program before the READ command is
executed. The file must still be terminated with a ";" delimiter.
3.3.2 FORTRAN Formatted Output
Data and text may be printed in a prespecified format by the WRITE
command. FORTRAN formatted output can be directed to either the terminal
or a file. The general format for the formatted write command is:
WRITE([filename] [statement_number]) var_1 var_2 ...
The statement number refers to a previously defined FORMAT statement.
If the optional "filename" is included, SORITEC Sampler writes the data
according to the format statement associated with "statement_number" to the
file "filename.LST". Otherwise, the data are written to the terminal or
the current output device if DOS redirection has been invoked. If the
statement number is omitted, data are printed in a list format similar to
the format used to PRINT variables at the terminal, e.g.,
              VAR_A
         ................
                 .
            1    .    1.00000
            2    .    2.00000
            3    .    2.50000
            4    .    3.50000
            5    .    5.00000
Variables in the variable list may be time-series, constants or parameters.
Up to 100 variables are allowed in a variable list.
When time-series or vectors are encountered in the variable list,
SORITEC Sampler writes all active observations to the terminal before
writing the next variable in the list. Placing parentheses around time-
series variables in the variable list, however, will direct SORITEC Sampler
to print one value from each variable in turn, allowing you to print time-
series in columns.
WRITE([filename] statement_number) constant_1 &
(time_series_1 time_series_2) constant_2
For example, the commands:
USE 1973Q1 1973Q4
102 FORMAT(15X,' GNP CONSUMPTION INVESTMENT'//10X,(3F11.1))
WRITE(102) (gnp consump invest)
produce the following output.
                   GNP     CONSUMPTION  INVESTMENT
                  475.7       301.4        71.0
                  468.3       306.2        70.1
                  487.7       312.8        82.3
                  490.7       320.8        65.6
Constants and parameters cannot be included in parentheses.
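The effect of the parentheses on output order can be sketched in Python.
This is an illustration of the ordering rule only, not of the WRITE
command or of FORMAT processing. The function name write_order is
hypothetical; a Python list stands for a time-series, and a tuple of lists
stands for a parenthesized group.

```python
def write_order(items):
    """Yield values in the order they would be written.

    A list is written out completely before the next item is started.
    A tuple of equal-length lists is written one observation at a time,
    taking one value from each series in turn. Anything else is treated
    as a single constant.
    """
    for item in items:
        if isinstance(item, tuple):
            for row in zip(*item):
                yield from row
        elif isinstance(item, list):
            yield from item
        else:
            yield item

gnp = [475.7, 468.3]
consump = [301.4, 306.2]

print(list(write_order([gnp, consump])))     # series by series
print(list(write_order([(gnp, consump)])))   # observation by observation
```

Only the second ordering lets a FORMAT with one field per series print the
data in columns.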
3.4 Keyboard Entry
Data may be entered directly from the keyboard using the FILL command,
which has the format:
FILL variable_name value_list
where "value_list" is the set of values assigned to the variable
"variable_name". For example,
FILL VAR_A 1 4 2 5 7 8
creates a new series VAR_A with the six specified values.
When there is no USE command in effect, a FILL command counts the data
items, stores them as undated data and defines an appropriate USE interval
which is assumed in later commands or until the USE period is redefined.
If there are too many or too few observations entered for the current USE
period, an error message is generated unless the ON RAGGED option is
enabled. The option command ON RAGGED permits entry, through FILL, of data
series that are shorter than the current USE interval without generating an
error. The unfilled observations are assigned MISSING values when this condition is
encountered. FILL will not accept data series longer than the current USE
period under any circumstances. FILL is commonly used to enter data series
that consist of few observations or to extend current data series.
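The length checks described above can be summarized in a short Python
model. This is a sketch of the stated rules, not of SORITEC's FILL
command. The function name fill and its arguments are hypothetical: n_obs
is the number of observations in the current USE period (None when no USE
command is in effect), and ragged stands for the ON RAGGED option.

```python
MISSING = None  # stand-in for SORITEC's MISSING value

def fill(values, n_obs=None, ragged=False):
    """Return the series that would be stored for the given value list."""
    if n_obs is None:
        # No USE period: the data are counted and stored as entered.
        return list(values)
    if len(values) > n_obs:
        # Too many values is an error whether or not RAGGED is on.
        raise ValueError("too many values for the current USE period")
    if len(values) < n_obs and not ragged:
        raise ValueError("too few values and ON RAGGED is not enabled")
    # With RAGGED on, the remaining observations become MISSING.
    return list(values) + [MISSING] * (n_obs - len(values))

print(fill([1, 4, 2, 5, 7, 8]))             # no USE period in effect
print(fill([1, 4], n_obs=4, ragged=True))   # short series padded
```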
3.5 Output of Data to the Terminal
Data may be output to the terminal in both tabular and graphical form.
If necessary, tables and graphs can be routed to the printer by using the
DOS "Ctrl-P" switch before entering the appropriate command.
3.5.1 Tabular Display
The simplest data display is produced by the PRINT command. Any data
series, vector, constant, parameter, equation or GROUP can be displayed
using this command, which has the form:
PRINT arg_1 arg_2 arg_3 ...
Types of arguments to be printed may be mixed, but this is generally
inadvisable. Since SORITEC does not put unlike items on the same lines,
mixing types or periodicities indiscriminately can generate lengthy out-
puts. The PRINT command can have up to 100 arguments, each of which must
be a legal SORITEC name. Lagged variables may be specified in a PRINT
command. To display data from the members of a GROUP, the ON GROUP option
must be active. PRINT displays the names of GROUP members if OFF GROUP is
enabled.
Data may be output to the terminal in specified formats and mixed with
text using the WRITE command. Refer to Section 3.3.2 for a description of
this command.
3.5.2 Graphical Display
Two types of graphical displays are available from SORITEC Sampler.
Both produce line printer-style graphics. SORITEC's estimation commands
can also produce medium resolution residuals plots on systems with color
graphics capability. These are discussed in Section 10.2.11.
Multi-variable plots of time-series or cross-section data are
generated by the PLOT command, which has the form:
PLOT series_1 symbol_1 series_2 symbol_2 ...
The PLOT command produces a line printer plot of observation number against
up to nine variables at once. Plotting symbols must be specified in the
command line for each variable to distinguish plotted values. Plotting
symbols may be alphanumeric (A-Z, 0-9) or the characters +, -, *, /, =.
If two variables, at some observation, are nearly equal so that they
occupy the same position on the screen, only the symbol for the latter-
named variable is displayed. The horizontal scale is determined automati-
cally so that all data values can be plotted. The WIDTH command can be
used to inform SORITEC Sampler that more (or less) than 72 characters can
be output on a single line. In this case, the width of the plot is
adjusted accordingly, e.g., WIDTH 132.
To generate meaningful output, all plotted variables should have
roughly the same range of values. Otherwise, some multiplicative or addi-
tive scaling may be necessary.
The relationship between two variables can be illustrated graphically
via the SCATTER command, which is specified as:
SCATTER series_1 series_2
SCATTER generates a scatter diagram with the variable referenced in the
first argument plotted with respect to the vertical or Y-axis and the
variable referenced in the second argument plotted against the horizontal
or X-axis. Lagged variables are permitted.
The graph size is dependent upon the number of characters that can
appear on a line. The default value is 72 but can be changed by the WIDTH
command.
3.6 SORITEC DataBank Files
SORITEC DataBank (.SDB) files are the most convenient means of acces-
sing data AFTER the data have been entered into SORITEC Sampler. The
databanking facility has its own set of commands for accessing and managing
data. These commands are described in the next chapter.
Chapter 4
SORITEC DataBank (SDB) Files
4.0 Introduction
SORITEC databanks are the key to using SORITEC Sampler efficiently.
SDB files can store data series, equations, matrices, vectors, scalars,
parameters, namelists and multiple equation models. SORITEC Sampler can
store an unlimited number of items if enough disk space is available.
Planned future enhancements include the ability to store and recall user
procedures, report formats, data descriptors and online "HELP" text.
SDB files are constructed in a "knapsack" database arrangement. In
effect, you can throw anything you want into an SDB file and then recall it
by name later. There is no need to specify the type of the data item, its
length, etc.; SORITEC Sampler keeps track of that for you.
The commands necessary to create and manipulate SDB files are
straightforward and easy to learn. The complete list is as follows.
4.1 Create a Databank
CREATE constructs and initializes a SORITEC databank. The only argu-
ment in the command line is the name of the database that you want to
create. For example,
CREATE filename
will create a file called filename.SDB for future use. The CREATE command
creates the databank on the default drive and directory. However, the file
can be created on an alternative drive or directory by enclosing the drive
specification and filename in single quotations, e.g.
CREATE 'd:filename'
or
CREATE '\path\filename'
Once the database is created, it remains open for I/O until either (a) a
different database is accessed, (b) the file is RETURNed, or (c) SORITEC
Sampler is terminated.
4.2 Access a Databank
ACCESS opens a SORITEC databank for use in the current job session.
The general form of the command is:
ACCESS filename
The database must already exist in the current directory as "filename.SDB"
or an error message is generated. Once a database is ACCESSed, SORITEC
Sampler automatically copies the data items referenced in a command into
the workspace if they are not already there. ACCESS automatically returns
any database which is currently open.
Databanks residing on drives other than the current drive may be
referenced by enclosing the drive designation and filename within single
quotation marks, as noted above for CREATE.
Depending on the implementation, there may be additional arguments to
the ACCESS command to specify special file formats (CitiBase for example),
passwords or read/write access.
4.3 Release a Databank from SORITEC
RETURN automatically closes any database which is currently open and
releases it from SORITEC's control. The format of the command is:
RETURN
No arguments are required with this command, as only the currently ACCESSed
databank is referenced. After the RETURN command, the database is no
longer accessible until another ACCESS command is executed.
4.4 Purge a Databank
Databanks may be purged from the DOS directory with the PURGE command.
The format of the command is:
PURGE filename
Since the database is permanently erased, this command should be used with
care! PURGE only works on SORITEC databases so it isn't possible to delete
an arbitrary file using this command. Reference to a database on a direc-
tory or drive other than the current one follows the same rules as the
CREATE and ACCESS commands.
4.5 Retrieve Items from a Databank into the Workspace
Data are explicitly copied from the currently accessed databank into
the workspace by the COPY command. The command syntax is:
COPY item_1 item_2 ... item_n
Arguments in the command line may be time-series, constants, parameters,
vectors, group names, and equations. Since the databank is always
implicitly searched for items needed by SORITEC commands, this command is
generally used only when you need to retrieve data from a second database.
If, for example, you wish to regress a measure of inflation, such as CPI,
stored on one database, against some measures of final demand, such as PCE
and DEFENSE, stored on another, the command sequence would be:
ACCESS inflate
COPY cpi
ACCESS fdemand
REGRESS cpi pce defense
4.6 Store Items in a Databank
Items in SORITEC's workspace are stored on the currently-accessed
databank with the KEEP command. The syntax of the command is:
KEEP item_1 item_2 ... item_n
Each argument, "item_i", can be a data series, constant, parameter, equation,
vector or group name. If you try to KEEP an item that has
the same name as an item that already exists in the database, a non-fatal
error is reported and the item is not replaced.
There are three ways to replace an item that already exists on a
SORITEC databank. First, the item stored in the databank can be explicitly
discarded using the DISCARD command and then stored using the KEEP command.
Second, the item can be replaced explicitly with the REPLACE command.
Lastly, items in a databank can be implicitly replaced with the KEEP
command if the ON REPLACE option has been enabled.
KEEP stores all observations associated with a given time-series,
regardless of the observation period currently set by the USE command.
For example, if the series GNP
is defined for 1950Q1 to 1984Q2 and the current USE period is for 1980Q1 to
1983Q4, the command KEEP GNP stores the series for 1950Q1-1984Q2. You may
save only the active observations by entering the command:
KEEP(ACTIVE) item_1 item_2 ... item_n
4.7 Replace Items in a Databank
Items in databanks are replaced by items of the same name in the
current workspace with the REPLACE command. The command syntax is:
REPLACE item_1 item_2 ... item_n
If the item is not currently stored on the database, a warning message is
generated but the item is still saved.
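The rules for KEEP, REPLACE and DISCARD described in this chapter can be modelled with a small Python class. This is a toy in-memory sketch, not SORITEC's implementation; the class name Databank, its method names and the message strings are assumptions, and the behaviour of DISCARD for a name that is not stored is not stated in the text.

```python
class Databank:
    """Toy in-memory model of the KEEP / REPLACE / DISCARD rules."""

    def __init__(self):
        self.items = {}
        self.messages = []

    def keep(self, name, value, on_replace=False):
        if name in self.items and not on_replace:
            # Non-fatal error: the stored item is left untouched
            self.messages.append(f"error: {name} already exists")
            return False
        # New item, or ON REPLACE enabled: store (and overwrite)
        self.items[name] = value
        return True

    def replace(self, name, value):
        if name not in self.items:
            # Warning only: the item is still saved
            self.messages.append(f"warning: {name} was not in the databank")
        self.items[name] = value

    def discard(self, name):
        # Assumption: discarding an unknown name is silently ignored
        self.items.pop(name, None)


bank = Databank()
bank.keep("gnp", [475.7, 468.3])
bank.keep("gnp", [0.0])            # rejected, original kept
bank.replace("gnp", [487.7])       # explicit replacement
bank.keep("gnp", [490.7], on_replace=True)  # implicit replacement
print(bank.items, bank.messages)
```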
4.8 Rename Items in a Databank
The names of items in a SORITEC databank are changed with the RENAME
command, which has the form:
RENAME new_name_1 old_name_1 new_name_2 old_name_2 ...
RENAME takes an even number of arguments consisting of pairs of item names.
The command renames item old_name_i to new_name_i. Note that the ordering
of the pair is new_name, followed by old_name, which is reversed from
argument orders usually found in SORITEC.
4.9 Switch the Names of Two Items in a Databank
Pairs of items in a SORITEC databank can have their names swapped by
the SWITCH command. The syntax of the command is:
SWITCH item_1 item_2
It is equivalent to the series of commands:
RENAME temp item_1
RENAME item_1 item_2
RENAME item_2 temp
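The argument order of RENAME and the three-step equivalence for SWITCH can be checked with a short Python sketch. It is illustrative only: the databank is represented by a plain dictionary, and the helper names rename and switch are assumptions.

```python
def rename(bank, new_name, old_name):
    """RENAME new_name old_name: note that the new name comes first."""
    bank[new_name] = bank.pop(old_name)


def switch(bank, item_1, item_2, temp="temp"):
    """SWITCH item_1 item_2, written as the three RENAMEs listed above."""
    rename(bank, temp, item_1)    # RENAME temp item_1
    rename(bank, item_1, item_2)  # RENAME item_1 item_2
    rename(bank, item_2, temp)    # RENAME item_2 temp


bank = {"cpi": [1.0, 1.1], "pce": [2.0, 2.2]}
switch(bank, "cpi", "pce")
print(bank["cpi"], bank["pce"])
```

After the call, the data formerly named cpi are stored under pce and vice versa, and no item named temp remains.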
4.10 Discard Items from a Databank
Items are erased from a databank with the DISCARD command. The format
of DISCARD is:
DISCARD item_1 item_2 ... item_n
Once DISCARDed, the item is irretrievably lost.
4.11 Generate a Directory Listing of a Databank
An alphabetically sorted directory listing of a SORITEC databank is
produced with the CONTENTS command, which has the form:
CONTENTS [filename]
If "filename" is omitted from the command line, SORITEC Sampler produces a
directory listing of the currently active databank. If no databank is
active, an error message is returned.
The optional argument "filename" is the name of a SORITEC database in
the current directory. Reference to a database on a directory or drive
other than the current one follows rules similar to the CREATE, ACCESS, and
PURGE commands.
Note that the command:
CONTENTS filename
attaches the named databank after returning the one currently attached. To
reference the previous databank, you must re-attach it with the ACCESS
command.
Chapter 5
Programming Constructs
5.0 Introduction
SORITEC provides a powerful interpretive programming language that
enables the user to simplify complex and repetitive estimation procedures
into a smaller set of commands that can be executed interactively or
through SORITEC's batch processing facility. SORITEC's programming lan-
guage supports numeric and alpha looping, and conditional and unconditional
transfer of control to other statements. When set up as a SORITEC Alterna-
tive Command (SAC) file, this programming language provides a convenient
means for developing more complex estimators and diagnostic statistics in
addition to those provided directly by SORITEC Sampler. The alternate
command file facility enables command files to call other command files so
that a series of command sequences can be executed. Note that command
files can be chained together but they cannot be nested. This means that
program control does not implicitly return to the command file from which
the call was made.
SORITEC also provides a PROCEDURE facility that allows you to
structure a sequence of commands into a subprogram that, once defined,
can be passed arguments and repetitively called, like a subroutine, from a
SORITEC command line. The PROCEDURE facility is not available in SORITEC
Sampler.
The commands associated with SORITEC Sampler's programming language
follow.
5.1 Numeric Looping
Repetitive execution of commands in SORITEC Sampler is accomplished by
DO loops. The DO loop has the following general format:
DO index = beginning_value TO end_value BY increment
.
.
(SORITEC Sampler commands)
.
.
END
The DO loop index, beginning_value, end_value and increment may be integer
or real scalars or parameters and you can proceed forward or backward
through the loop by assigning a positive or negative value to the incre-
ment. Both the end_value and increment may be reset dynamically within the
loop. If so, the new values are used to determine whether the loop is
executed again. If the BY increment is omitted from the DO command line,
it is set to 1. A DO command, with no specified values for
"beginning_value", "end_value" and "increment", will cause the statements
before the END command to be executed once.
If the DO variable's initial value exceeds its maximum value before a
positive increment is added, an error message is generated and the state-
ments between the DO and END statements are not executed. The same situa-
tion results if the variable's initial value is set lower than a final
value to be reached by negative increments.
You can construct a DO loop to index through members of a group. For
example, the commands:
GROUP group_name series_1 series_2 ... series_n
ON GROUP
DO i = 1 TO n
REGRESS y group_name(i)
END
would regress the dependent variable "y" against each of the time-series in
the group "group_name" successively.
5.2 Unconditional Branching
SORITEC Sampler allows you to transfer control to any command prefixed
by a statement number. The format of the command is simply:
GO TO statement_number
Alternatively, the command may be specified as GOTO.
Statement numbers may be numbers, CONSTANTs or PARAMETERs and must be
in the range 1 to 9999. They may be prefixed to most commands and FORMAT
statements, but not GO TO statements. Other commands that may not be
prefixed are:
JOB ONLIST
HELLO OFFLIST
SCAN MAXERR
WIDTH COMMENT
In batch mode, if the specified command number does not exist, an
error message is generated, and control passes to the statement which
follows the GO TO command. In interactive mode, the system responds with a
query for the missing statement number until the statement number is
entered.
5.3 Conditional Branching
Conditional branching is enabled through an IF/THEN/ELSE command
structure. The general format for the command sequence is:
IF condition; THEN; command_sequence_1; ELSE; command_sequence_2
A "condition" must be an arithmetic expression that may include logical and
relational operators, as needed. When the condition is satisfied, control
transfers to "command_sequence_1", otherwise control is transferred to
"command_sequence_2". The IF/THEN/ELSE sequence MUST be delimited by semi-
colons, as specified above. An IF/THEN/ELSE command structure CANNOT be
nested.
Command sequences in conditional branching statements may be composed
of a single command or a series of commands. If more than one command
comprises a command sequence, they must be structured in a DO loop, e.g.,
IF a > b; THEN; DO
c = b * log(a)
print a b c
END;
ELSE;
DO
c = a * log(b)
plot a # b *
END
Obviously, a DO loop in an IF/THEN/ELSE sequence can be executed
repetitively by specifying the index, initial value, final value and,
optionally, the increment in the DO command line.
Either the THEN or the ELSE clause may be omitted from a conditional
branching command sequence. The IF command can also be used with the GO TO
command to control the order of execution, e.g.
IF x < y .and. a > b; THEN; GO TO 300
5.4 Null (Continuation) Statement
The CONTINUE statement is generally used in SORITEC Sampler to posi-
tion a statement number within a SORITEC program. Its syntax is:
statement_number CONTINUE
As such, it is not executed.
5.5 Alpha Looping
SORITEC Sampler will repetitively execute a sequence of commands by
indexing over a set of alphabetic loop control variables. On each pass
through the loop, SORITEC Sampler supplies succeeding alphabetic arguments
in the DOT statement. The DOT statement is functionally similar to a DO
command. The format of the command is:
DOT variable_1 variable_2 ... variable_n
.
.
(SORITEC Sampler commands)
.
.
.
ENDDOT
Alpha loop control variables are entered successively into the commands
within the DOT loop: every colon (":") in the loop is replaced by the
currently active alpha variable, i.e.,
DOT a b c                          REGRESS y a
REGRESS y :      is executed as    REGRESS y b
ENDDOT                             REGRESS y c
You may also use the colons as suffixes to construct new variables within
DOT loops, e.g.,
DOT var1 var2 var3                    outvar1 = inpvar1 * z
out: = inp: * z     is executed as    outvar2 = inpvar2 * z
ENDDOT                                outvar3 = inpvar3 * z
The colon may not be used as a prefix, however. All commands in the DOT
loop are executed as many times as there are variables in the DOT command.
Note that if group expansion is enabled by the ON GROUP switch, a DOT loop
can index through a GROUP, i.e.
GROUP group_name var_1 var_2 var_3 ...
ON GROUP
DOT group_name
regress y :
ENDDOT
would regress the dependent variable, y, against each of the time-series in
the GROUP "group_name".
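The colon substitution performed by a DOT loop amounts to a plain text replacement, which the following Python sketch imitates. It is not SORITEC code; the function name expand_dot is an assumption, and GROUP expansion is not modelled.

```python
def expand_dot(variables, commands):
    """Expand a DOT ... ENDDOT body: one pass per alpha variable, with
    every ':' replaced by the currently active variable name."""
    expanded = []
    for name in variables:
        for command in commands:
            expanded.append(command.replace(":", name))
    return expanded


for line in expand_dot(["a", "b", "c"], ["REGRESS y :"]):
    print(line)
for line in expand_dot(["var1", "var2", "var3"], ["out: = inp: * z"]):
    print(line)
```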
Chapter 6
Dummy Data Series Generation and Special
Transformation Commands
6.0 Introduction
SORITEC Sampler provides several commands that generate or transform
time-series. These commands create dummy variables or they transform
existing data series into new time-series. They include facilities for
converting time-series from one periodicity to another and for transforming
continuous into discrete variables. SORITEC Sampler also provides com-
mands that compute modular division and invoke maximum and minimum
functions.
6.1 Create a Time Trend Dummy Series
SORITEC Sampler generates a time trend dummy series with the TIME command.
The syntax of this command is:
TIME [series_name]
TIME sets the first observation of the "series_name" associated with the
currently active USE period equal to one and increments successive
observations by one, so that the second observation is set to two, the
third to three, etc. If the "series_name" is omitted from the command
line, TIME stores the time trend dummy in a series named "time". If a
variable by that name already exists in the workspace, it will be overwrit-
ten by the TIME command.
The TIME command may only be invoked when there are no internal gaps
in the current USE period, i.e., the current USE period must have been
invoked with only two arguments.
6.2 Create Seasonal Dummies
A periodic dummy variable can be created using the DUMMY command,
which has the form:
DUMMY output_series first_observation skip_increment
In the command line, "first_observation" is the first observation set to
one. Series elements are then set to one every "skip_increment"
observations. The remaining values of the series are set to zero.
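The series produced by TIME and DUMMY can be sketched in Python as follows. The sketch is illustrative only: the function names are assumptions, and "first_observation" is treated as a 1-based position within the active USE period, which may differ from the form SORITEC expects for dated data.

```python
def time_trend(n_obs):
    """TIME: 1, 2, 3, ... over the active observations."""
    return list(range(1, n_obs + 1))


def dummy(n_obs, first_observation, skip_increment):
    """DUMMY: 1 at first_observation and at every skip_increment
    observations after it, 0 elsewhere (positions are 1-based here)."""
    return [
        1 if t >= first_observation
        and (t - first_observation) % skip_increment == 0 else 0
        for t in range(1, n_obs + 1)
    ]


# A first-quarter seasonal dummy over eight quarterly observations
print(time_trend(8))
print(dummy(8, 1, 4))
```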
6.3 Recode a Variable
SORITEC Sampler allows you to convert a continuous variable into a
discrete variable via the RECODE command. The form of the command line is:
RECODE output_series input_series p(1) p(2) p(3) p(4) ...
In the above command line, "input_series" is the series to be recoded and
"output_series" is the categorized output variable. The p(i) are the
interval boundaries for the recoding process.
To show the RECODE function, the commands:
FILL a 3 17 21 28 31 35 26 41
RECODE b a 10 20 25 30 35 40
PRINT a b
produce these results.
A B
1 3 0
2 17 1
3 21 2
4 28 3
5 31 4
6 35 5
7 26 3
8 41 6
For each element, i, of the series, RECODE uses the following formula:
output_series(i) = k if p(k) <= input_series(i) < p(k+1)
when p(k) <> p(k+1), and
output_series(i) = k if p(k) = input_series(i) = p(k+1)
p(0) is always considered to be -infinity, and p(n+1) (where n is the
number of p(i) in the command) is always considered to be +infinity.
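The table in the example above can be reproduced with a short Python sketch that assigns each value the number of interval boundaries at or below it. This is an illustration, not SORITEC's implementation: the function name recode is an assumption, the boundaries are assumed to be in ascending order, and repeated boundaries are not modelled.

```python
from bisect import bisect_right


def recode(input_series, boundaries):
    """Assign each value the count of boundaries at or below it."""
    # bisect_right returns how many boundaries p satisfy p <= value
    return [bisect_right(boundaries, value) for value in input_series]


a = [3, 17, 21, 28, 31, 35, 26, 41]
b = recode(a, [10, 20, 25, 30, 35, 40])
for obs, (x, y) in enumerate(zip(a, b), start=1):
    print(obs, x, y)
```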
6.4 Conversion of Time-Series from One Periodicity to Another
The periodicity of dated and undated time-series is converted by
SORITEC Sampler with the CONVERT command. The command has the following
syntax:
CONVERT [(modifier)] output_series = input_series
When the command is executed, data of one periodicity are converted to the
periodicity specified by the current USE statement. In other words, the
periodicity of the "input_series" does not have to be explicitly specified,
since SORITEC Sampler determines it internally.
Lags are not allowed in CONVERT arguments and the entire series is
always converted, regardless of the range specified in the USE command.
While the standard syntax of the convert command requires the specifi-
cation of both an output (result) series and an input series, the converted
series can be written to the input series name simply by specifying:
CONVERT [(modifier)] input_series
After the conversion, the old values of the input series, in the old
periodicity, are lost.
The modifier argument in the command line is optional, and controls
the type of conversion which takes place. There are two sets of modifiers,
one for aggregation (such as monthly to annual), and one for disaggregation
(such as annual to monthly), plus a special MOVE modifier for converting to
and from undated data. The modifiers are:
AGGREGATION
SUM Sum observations in each period (default)
AVERAGE Average observations in each period
MIN Find the minimum observation in each period
MAX Find the maximum observation in each period
LAST Use the last observation in each period
DISAGGREGATION
FILL Use the data point for the entire period in each sub-period
SHARE Divide the data value for the entire period equally
across all sub-periods (default)
UNDATED TO DATED CONVERSIONS
MOVE Move the data from an undated to a dated variable or
vice versa without alteration (default)
Modifiers do not have to be entered into the command line explicitly if the
default is selected.
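The arithmetic behind the aggregation and disaggregation modifiers can be sketched in Python. This is an illustration under stated assumptions, not SORITEC code: the function names are hypothetical, the series length is assumed to be a whole number of periods, and the handling of partial periods or MISSING values is not modelled.

```python
AGGREGATORS = {
    "SUM": sum,
    "AVERAGE": lambda xs: sum(xs) / len(xs),
    "MIN": min,
    "MAX": max,
    "LAST": lambda xs: xs[-1],
}


def aggregate(series, per_period, modifier="SUM"):
    """Collapse each block of per_period observations to one value."""
    func = AGGREGATORS[modifier]
    return [func(series[i:i + per_period])
            for i in range(0, len(series), per_period)]


def disaggregate(series, per_period, modifier="SHARE"):
    """Spread each value over per_period sub-periods."""
    if modifier not in ("FILL", "SHARE"):
        raise ValueError("modifier must be FILL or SHARE")
    out = []
    for value in series:
        # FILL repeats the value; SHARE divides it equally
        piece = value if modifier == "FILL" else value / per_period
        out.extend([piece] * per_period)
    return out


quarterly = [1, 2, 3, 4, 5, 6, 7, 8]
print(aggregate(quarterly, 4))             # quarterly to annual, SUM
print(aggregate(quarterly, 4, "AVERAGE"))
print(disaggregate([12], 4))               # annual to quarterly, SHARE
```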
Conversion is currently permitted only between annual, semi-annual,
quarterly, monthly, ten-day and undated data types. In addition, conver-
sion from monthly to ten-day periodicity produces incorrect results because
of the way the ten-day data type is defined. See Section 2.4 for
information on data types supported by SORITEC.
6.5 Maximum Function
SORITEC Sampler can determine the maximum of a series or can generate
a new series from several containing the maximum value associated with each
observation.
The maximum value of a series is found by entering the MAX command
with only two arguments, i.e.,
MAX maximum_value input_series
When entered like this, "input_series" is the data series over which the
maximum is to be taken. The result is stored in "maximum_value" which must
be a CONSTANT or PARAMETER. If the "maximum_value" name is undefined
prior to entering the command, SORITEC Sampler defines it to be a CONSTANT.
A new series consisting of the set of maximum values, by observation,
associated with several series is generated by the MAX command when more
than two arguments are entered in the command line, i.e.,
MAX output_series input_series_1 input_series_2 ...
In this case, all arguments in the command line must be data series. The
resulting "output_series" contains the observation-by-observation maximum
of all the remaining arguments. Up to 99 input series can be evaluated by
this command.
6.6 Minimum Function
The minimum value of a data series or a series of minimum values, by
observation, of several series is obtained using the MIN command. The
format and use of MIN is identical to the MAX command except for the result
it computes. In other words, the minimum value of a data series is
determined when the MIN command is followed by two arguments:
MIN minimum_value input_series
where the first argument is a CONSTANT or PARAMETER and the second is the
series you wish to evaluate.
A series containing observation-by-observation minimums is generated
when more than two arguments, all of which must be data series, follow the
MIN command, i.e.,
MIN output_series input_series_1 input_series_2 ...
The same restrictions as apply to the MAX function apply to MIN.
6.7 Modular Division
SORITEC Sampler performs modular division via the MOD command, which
has the following format:
MOD remainder dividend divisor
In mathematical notation, the formula used is:
remainder = dividend - (INT(dividend/divisor) * divisor)
where INT is the integer part of the quotient within parentheses.
The dividend and divisor must be of the same type and may be
CONSTANTs, PARAMETERs or data series with the resulting "remainder" being
the same type. Modular division is useful for generating sequences of
uniform random numbers in SORITEC Sampler.
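The MOD formula can be written out in Python as a check. The sketch assumes that INT truncates toward zero, which the text does not state explicitly; the function name mod is likewise an assumption. Under that assumption the result differs from Python's own % operator when the dividend is negative.

```python
import math


def mod(dividend, divisor):
    """remainder = dividend - INT(dividend / divisor) * divisor"""
    # math.trunc keeps the integer part of the quotient (toward zero)
    return dividend - math.trunc(dividend / divisor) * divisor


print(mod(17, 5))
print(mod(-7, 3), -7 % 3)  # truncating remainder vs. Python's floored one
```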
6.8 Compute Moving Average
The moving average of a series is calculated by the MA command.
MA output_series input_series length
In the command line, "input_series" is the series to be averaged, "length"
is the length of the moving average, and "output_series" is the resulting
series. The argument, "length", may be a CONSTANT, PARAMETER, or a numeric
quantity. The first n observations of the output_series, where n is the
length of the moving average, are treated as MISSING data.
6.9 Compute Moving Sum
The MSUM command computes the moving sum of a series.
MSUM output_series input_series length
Arguments in the command line have the same meaning as the MA command. The
first n observations of the output_series, equivalent to the length of the
moving sum, are treated as MISSING data.
6.10 Statistical Operations
Several statistical functions are available for analyzing and
manipulating data. They are described in the following sections.
6.10.1 Correlation Matrix Calculation
A correlation matrix for the variables in an argument list is
generated by the CORREL command. The format of the command is:
CORREL series_1 series_2 series_3 ...
Only observations active in the currently defined USE period are used in
correlation matrix calculations. While only the correlation matrix is
output to the terminal, the correlation matrix (COR), vector of means
(MEANS), vector of standard deviations (DEVS) and covariance matrix (COV)
are calculated by CORREL and stored as SORITEC internal variables. These
results may be accessed with a RECOVER command.
6.10.2 Covariance Matrix Calculation
The COVA command computes, stores and prints a covariance matrix for
the variables named as arguments in the command line. The format of the
command is:
COVA series_1 series_2 series_3 ...
Similar to the CORREL command, only observations associated with the
currently active USE period are used in calculations. The vector of means
(MEANS), vector of standard deviations (DEVS) and covariance matrix (COV)
are stored as SORITEC internal variables when the COVA command is executed,
and may be accessed by the RECOVER command.
6.10.3 Other Statistical Operations
Several specialized statistical operations are supported by SORITEC
Sampler to describe the properties of a time-series. All operations have a
standard format which consists of the command name, followed by the output
variable and the input series, i.e.,
COMMAND output_constant input_series
Statistics are calculated over the currently active USE period. The
statistical operations available in SORITEC Sampler and commands for
executing them are:
Command                                   Description
-------                                   -----------
MEAN mean input_series                    Arithmetic Mean
RMS  root_mean_square input_series        Root Mean Square
SUM  sum input_series                     Arithmetic Sum
SSR  sum_squared_resids input_series      Sum of Squared Residuals
Chapter 7
SORITEC Financial Functions
7.0 Financial Functions in SORITEC
SORITEC Sampler contains most of the common financial analysis
functions. These functions, used alone or with SORITEC's forecasting
commands, provide extremely powerful tools for performing financial project
evaluation. The functions currently provided include internal rate of
return, present value, and various loan amortization schedules.
Note that in all SORITEC Sampler financial functions, interest rates
are treated as decimal quantities unless otherwise noted; specifically, 15%
is represented as 0.15.
7.1 Internal Rate of Return
The internal rate of return command calculates the internal rate of
return for an arbitrary series "X" via a modified Newton-Raphson search
algorithm. The format of the command is
IRR([CAPITAL=#,ITER=#,TOL=#,INITIALR=#]) &
interest_rate net_income_series
where "interest_rate" is a legal SORITEC constant name for the resulting
interest rate which discounts the "net_income_series" to a zero net present
value.
Alternatively, the IRR command can be used to calculate the internal
rate of return on the profits or benefits associated with a project with
known costs. In this situation, the form of the command is:
IRR([CAPITAL=#,ITER=#,TOL=#,INITIALR=#]) &
interest_rate benefits costs
Here, the second series is subtracted from the first in calculating the
IRR.
The optional modifiers in the command line allow the user to control
the parameters determining convergence for the algorithm as well as speci-
fication of an arbitrary start-up capital cost. Specifically,
CAPITAL is the start-up cost of the project. It is auto-
matically subtracted from the first period profits.
ITER is the maximum number of iterations for the search.
The default is 50.
TOL is the tolerance level that defines convergence. An
absolute or relative change in the net present value of
less than TOL results in convergence. The default value
is .00001.
INITIALR allows the user to specify a starting value for
the iterations. This is of special value in finding
multiple roots to the IRR equation when cash flows
change signs more than once during the life of the
project.
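The idea of the search can be illustrated with a plain Newton-Raphson iteration in Python. This is a minimal sketch, not SORITEC's modified algorithm. Several details are assumptions: the first observation is treated as undiscounted, convergence is declared when the absolute net present value falls below the tolerance (rather than on its change between iterations), and the default starting rate of 0.1 is arbitrary. The function names npv and irr are hypothetical.

```python
def npv(rate, flows):
    """Net present value; the first flow is treated as undiscounted."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(flows))


def irr(flows, capital=0.0, max_iter=50, tol=1e-5, initial_rate=0.1):
    """Plain Newton-Raphson search for the rate at which NPV is zero."""
    flows = list(flows)
    flows[0] -= capital  # start-up cost comes out of the first period
    rate = initial_rate
    for _ in range(max_iter):
        value = npv(rate, flows)
        if abs(value) < tol:
            return rate
        # derivative of NPV with respect to the rate
        slope = sum(-t * cf / (1.0 + rate) ** (t + 1)
                    for t, cf in enumerate(flows))
        if slope == 0.0:
            break
        rate -= value / slope
    raise ArithmeticError("IRR search did not converge")


# 100 invested at the start, 60 returned in each of the next two periods
print(irr([0.0, 60.0, 60.0], capital=100.0))
```

When cash flows change sign more than once, different values of initial_rate may lead the iteration to different roots, which is the situation the INITIALR modifier addresses.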
7.2 Present Value
The present value command, PV, calculates the net present value of a
stream of net benefits (or profits) associated with a financial venture.
PV will take either a scalar value for the interest rate or a time series
of forecast values. This latter feature, when combined with the estimation
and forecasting capabilities of SORITEC Sampler, provides a powerful tool
for simulating and evaluating financial projects. The syntax of the com-
mand is:
PV([PERIOD=<D,W,T,M,Q,S,A>,<SIMPLE,COMPOUND>]) &
present_value net_income_stream <costs> interest_rate
where "present_value" is a scalar value equal to the present value of the
income stream, "net_income_stream" is the net income stream to be
discounted, and "interest_rate" is the interest rate used in calculating
the present value. The interest rate can be either a scalar, fixed for all
periods, or a time series of interest rates. This allows for easy
incorporation of interest rate forecasts into project evaluation.
The "net_income_stream" can be followed by an optional cost series, so
the income to be discounted can be given either as a single net income
stream or as a pair of series describing the revenues and costs of the
project.
The optional modifiers in the command line allow the user to convert
the periodicity of the interest rate to conform to the net income stream
and to specify the type of conversion to be performed. Specifically,
PERIOD allows an interest rate conversion to be spec-
ified; specifically, setting PERIOD equal to one of the
options results in the specified interest rate being
converted from the selected periodicity to the period-
icity of the current USE period. The periodicity may be
(D)aily, (W)eekly, (T)en Day, (M)onthly, (Q)uarterly,
(S)emi-annual or (A)nnual.
A second option, specified either as SIMPLE or COMPOUND,
is the type of conversion to be used. The default is
COMPOUND conversion.
The PERIOD modifier used with the conversion option can handle trans-
formations between annual or effective interest rates and the effective
periodic percentage rates. If the annual rate is given as 15%, the effec-
tive annual percentage rate is 16.0754% - calculated as .15/12 = 1.25%
compounded monthly. For example,
PV(PERIOD=A,SIMPLE) pv_result PROFIT .15
will correctly convert the 15% annual percentage rate to a 1.25% monthly
rate before calculating the present value. If the available data are given
in terms of effective yields, the COMPOUND option should be used to
correctly convert rates between periods. A loan requiring 4% per quarter
is equivalent to a loan rate of 1.316% compounded monthly
[exp(ln(1.04)/3) - 1]. Here, the appropriate command would be:
PV( PERIOD=Q, COMPOUND ) pv_result PROFIT .04
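The arithmetic behind the two worked examples above can be checked with a few lines of Python. This is a sketch only and is not SORITEC code: the function names simple_rate and compound_rate are hypothetical, and it assumes that SIMPLE divides the rate evenly across the shorter periods while COMPOUND takes the matching root of (1 + rate), as the two examples imply.

```python
import math

def simple_rate(rate, periods):
    """SIMPLE conversion (assumed): divide the rate evenly across periods."""
    return rate / periods

def compound_rate(rate, periods):
    """COMPOUND conversion (assumed): per-period rate that compounds to `rate`."""
    return math.exp(math.log(1.0 + rate) / periods) - 1.0

# 15% annual percentage rate converted to a monthly rate (12 months per year)
monthly = simple_rate(0.15, 12)
effective_annual = (1.0 + monthly) ** 12 - 1.0
print(f"monthly rate {monthly:.4%}, effective annual rate {effective_annual:.4%}")

# 4% effective quarterly rate converted to a monthly rate (3 months per quarter)
print(f"monthly rate {compound_rate(0.04, 3):.3%}")
```

The first case reproduces the 1.25% monthly rate and the roughly 16.075% effective annual rate quoted in the text; the second reproduces the 1.316% monthly rate.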
7.3 Loan Amortization
The loan amortization procedure (AMORT) provides a convenient
technique for calculating the monthly payment for a given loan situation.
In addition to the standard loan value and interest rate setup, AMORT also
supports an arbitrary number of loan payment series, balloon payments,
variable interest rates, as well as options for dynamically extending the
amount of the loan through additional borrowings. The format of the com-
mand is:
AMORT([PERIOD=<D,W,T,M,Q,S,A>,<SIMPLE,COMPOUND>], &
[RULEOF78],[BALLOON=#]) &
payment loan interest_rate [aux_pay_1 ... aux_pay_n]
where "payment" is the resulting per-period payment to fully amortize the
loan during the current USE period, and "loan" is the amount of the loan.
The loan can be either a constant or a time-series, if the loan is
allocated over the time period set in the USE command. "interest_rate"
is the interest rate of the loan. It must be the same type, either
constant or time-series, as the "loan".
The optional command line arguments, "aux_pay_i" are time-series of
auxiliary payments in addition to the monthly loan payment. These can be
used to enter payments to principal that are awkwardly or randomly timed.
For example, a loan which required balloon payments of $5000 every five
years can be handled as a time-series with value 5000 for every fifth year
and zeros elsewhere.
The optional modifiers in the command line allow the user to change
the amortization schedule as follows:
PERIOD is the same as for PV. It allows an interest
rate conversion to be specified; specifically, setting
PERIOD equal to one of the options results in the
specified interest rate being converted from the
selected periodicity to the periodicity of the current
USE period. The periodicity may be (D)aily, (W)eekly,
(T)en-Day, (M)onthly, (Q)uarterly, (S)emi-annual or
(A)nnual.
RULEOF78 constructs a principal and interest payment
series for the loan according to the "Rule of 78" (sum
of the months). This option is only valid for loans
with a single period of borrowing and a fixed interest
rate.
BALLOON allows the specification of a balloon payment in
the final period.
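For readers who want to check AMORT results by hand, the following Python sketch shows the standard level-payment formula and a Rule of 78 interest allocation. It is an illustration under stated assumptions, not a description of AMORT's internals: it covers only a constant loan with a fixed per-period rate, the function names are hypothetical, and the loan figures are invented for the example.

```python
def level_payment(loan, rate, n):
    """Per-period payment that fully amortizes `loan` over n periods
    at a fixed per-period interest rate (standard annuity formula)."""
    if rate == 0:
        return loan / n
    return loan * rate / (1.0 - (1.0 + rate) ** -n)

def schedule(loan, rate, n):
    """Return (interest, principal, balance) rows for a level-payment loan."""
    pay = level_payment(loan, rate, n)
    rows, balance = [], loan
    for _ in range(n):
        interest = balance * rate
        principal = pay - interest
        balance -= principal
        rows.append((interest, principal, balance))
    return rows

def rule_of_78_interest(total_interest, n):
    """Allocate total interest over n periods with weights n, n-1, ..., 1."""
    digits = n * (n + 1) // 2          # 78 when n == 12
    return [total_interest * (n - k) / digits for k in range(n)]

# Hypothetical loan: 10,000 repaid over 12 months at 1.25% per month.
rows = schedule(10000.0, 0.0125, 12)
total_interest = sum(r[0] for r in rows)
print(round(level_payment(10000.0, 0.0125, 12), 2))
print([round(v, 2) for v in rule_of_78_interest(total_interest, 12)][:3])
```

With twelve periods the weights sum to 78, which is where the rule takes its name; the first month carries 12/78 of the total interest and the last month 1/78.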
Chapter 8
SORITEC Sampler Cross-Section Techniques
8.0 Introduction
The full version of SORITEC contains most of the common techniques for
processing and analyzing cross-sectional data sets and, in addition to
providing access to most of the intermediate and final results, also imple-
ments several diagnostic tests not reported by most statistical packages.
The specific subset of techniques currently implemented in SORITEC Sampler
is as follows:
SYNOPSIS provides a quick statistical summary of a data series.
XTAB carries out a standard r * c contingency table analysis
including tests of independence.
8.1 Synopsis
The SYNOPSIS command returns a detailed summary analysis of a data
series including mean, standard deviation, median (including a 95%
confidence interval), quartiles, deciles, variance, skewness, kurtosis,
coefficient of variation, number of observations, number of missing
values, minimum, maximum, range, mode and the frequency of the mode. The
command format of SYNOPSIS is:
SYNOPSIS var_1 var_2 ... var_n
In addition to outputting them to the terminal, SYNOPSIS stores the
summary statistics as SORITEC internal variables, which may be recovered
either explicitly with the RECOVER command or by implicit reference. See
the description of the RECOVER command in Section 2.10 to retrieve these
data. Except for the DECILE and QUARTIL statistics, internal variables
associated with the SYNOPSIS command are stored as vectors that have the
same number of elements as there are arguments in the SYNOPSIS command
line. Recoverable SORITEC internal variables stored as vectors are:
COUNT = number of non-missing observations for each variable
MEDIAN = median value for each variable
MIN = minimum values
MAX = maximum values
RANGE = range for each variable (max - min)
MEANS = mean values
VARS = variances for each variable
DEVS = standard deviations
CV = coefficient of variation for each variable
KURT = kurtosis of each variable
SKEW = skewness for each variable
MODE = mode values for each variable
Two other internal variables are stored upon execution of the SYNOPSIS
command. The variables are:
DECILE = decile values of a series
QUARTIL = quartile values of a series
Currently, the DECILE and QUARTIL internal variables are each stored as a
single vector, meaning that decile and quartile values are stored only for
the last argument in the command line.
Quantiles are defined as the first observations less than or equal to
the true mathematical quantiles (n/4 and n/10) in both cases.
Note that SYNOPSIS exercises casewise deletion of missing values on
each variable when it computes the summary statistics. Because of this, the
statistics may not compare with those from other SORITEC statistics com-
mands like STATS, KURTOSIS, etc.
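The effect of deleting missing values separately for each variable can be seen in the following Python sketch. It is not SORITEC code: missing values are represented by None, the function name synopsis is hypothetical, and the n-1 divisor for the variance and the unscaled coefficient of variation are assumptions. The dictionary keys follow the internal names listed above, except N_MISSING, which is invented for the illustration.

```python
import statistics

def synopsis(series):
    """Summary statistics for one series, ignoring missing (None) values."""
    x = [v for v in series if v is not None]
    mean = statistics.mean(x)
    dev = statistics.stdev(x)            # sample standard deviation (assumed)
    return {
        "COUNT": len(x),
        "N_MISSING": len(series) - len(x),
        "MEANS": mean,
        "MEDIAN": statistics.median(x),
        "MIN": min(x),
        "MAX": max(x),
        "RANGE": max(x) - min(x),
        "VARS": dev ** 2,
        "DEVS": dev,
        "CV": dev / mean,
    }

income = [10, 12, None, 14]
age = [30, None, None, 50]
for name, series in (("income", income), ("age", age)):
    s = synopsis(series)
    print(name, s["COUNT"], s["N_MISSING"], s["MEANS"], s["DEVS"])
```

Each series keeps every observation it has, so the two variables are summarized over different numbers of cases. A command that drops an observation whenever any variable is missing would use only the first case for both series here.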
8.2 Crosstabulation Analysis
The XTAB command calculates the standard r * c crosstabulation report.
The format of the command is:
XTAB series_1 series_2
The arguments "series_1" and "series_2" must be discrete data. If the
series you wish to crosstabulate are continuous, they must first be
converted via the RECODE command. XTAB does not delete missing values
but instead reports them as a separate category, "MISSING", in the
appropriate row or column.
In addition to printer-oriented output, XTAB has an interactive
screen display mode which allows scrolling through the table in a
"spreadsheet" mode. This feature is described in Chapter 10.
XTAB stores the following internal results. The full table is stored
only when the NOMATS option is OFF.
^NROW = number of distinct row values (variable #1)
^NCOL = number of distinct column values (variable #2)
^RMARGIN = an nrow x 1 vector containing the row margin values
^CMARGIN = an ncol x 1 vector containing the column margin values
^XTABLE = nrow by ncol matrix composing the inner table
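The relationship between the inner table and its margins can be sketched in Python as follows. This is an illustration only, not SORITEC code: the function name xtab is hypothetical, missing values are represented by None, and the ordering of categories is an arbitrary choice made for the example.

```python
from collections import Counter

def xtab(rows, cols):
    """Count joint occurrences; None is reported as the category 'MISSING'."""
    def label(v):
        return "MISSING" if v is None else v
    pairs = Counter((label(r), label(c)) for r, c in zip(rows, cols))
    row_vals = sorted({r for r, _ in pairs}, key=str)
    col_vals = sorted({c for _, c in pairs}, key=str)
    # Counter returns 0 for combinations that never occur.
    table = [[pairs[(r, c)] for c in col_vals] for r in row_vals]
    rmargin = [sum(row) for row in table]
    cmargin = [sum(col) for col in zip(*table)]
    return row_vals, col_vals, table, rmargin, cmargin

region = [1, 1, 2, 2, None]
group = [1, 2, 1, 1, 2]
row_vals, col_vals, table, rmargin, cmargin = xtab(region, group)
for value, row, total in zip(row_vals, table, rmargin):
    print(value, row, total)
print("column margins", cmargin)
```

In the terms used above, len(row_vals) and len(col_vals) play the role of ^NROW and ^NCOL, table corresponds to ^XTABLE, and rmargin and cmargin to ^RMARGIN and ^CMARGIN.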
Chapter 9
Estimation and Forecasting with SORITEC Sampler
9.0 Introduction
The SORITEC Sampler provides you with several single-equation estima-
tion techniques for both single equation and simultaneous equation models.
Both ordinary least squares (OLS) and two-stage least squares regression
estimators are available. In addition, both the Cochrane-Orcutt and
Hildreth-Lu autocorrelation techniques for the single-equation model are
supported by SORITEC Sampler. These procedures may be applied to either
time-series or cross-section data. However, the structure of the equations
in any model to be estimated must be linear. The fitted equations of all
linear models estimated by SORITEC Sampler can be recovered and forecast.
The standard output from a SORITEC estimation command consists of a
coefficient tableau and a summary tableau of regression diagnostics which
includes the number of observations, the standard error of the regression,
mean of the dependent variable, R squared, R Bar squared, Durbin-Watson, F
test of overall significance, the log-likelihood, and the Akaike and
Schwarz statistics for model selection. The user may have the estimator
generate additional diagnostics by setting one or more options with ON
commands, which must be executed before the regression command. Use of
these options is described in Chapter 2. SORITEC estimation procedures
support ON VCOV, ON STATS, ON CCOR, ON ANOVA, ON PLOT, ON RESIDUAL and ON
BETA commands. These options are associated with SORITEC's interactive
tableaus and are described in Chapter 10.
When the ON CRT option is invoked, all estimation commands described
in this chapter support the display in interactive tableaus of regression
diagnostics. These tableaus provide the user with a greater number of
regression diagnostics than are output by the estimation commands in their
default modes. Commands for invoking the interactive tableaus and descrip-
tions of their contents are detailed in the next chapter.
9.1 Ordinary Least Squares (OLS) Estimation
The ordinary least squares estimator is invoked by the REGRESS command
which has the following syntax.
REGRESS [(ORIGIN)] dep_var ind_var1 ind_var2 ... ind_varn
The dependent variable must be the first argument in the variable list,
with the independent variables following immediately as the second through
last arguments. The keyword ORIGIN is optional and, if specified, forces
SORITEC Sampler to estimate the equation without a constant term. Other-
wise, the constant term is supplied automatically, not by the user. If
ORIGIN is specified in the command line, it must be enclosed within
parentheses. When the regression plane is forced through the origin, the
regression diagnostics are adjusted accordingly.
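The difference between estimating with and without a constant term can be illustrated with a one-regressor example in Python. This sketch is not SORITEC code; the function name ols_simple and its origin argument are hypothetical stand-ins for REGRESS and the ORIGIN keyword, and only the coefficients are computed.

```python
def ols_simple(y, x, origin=False):
    """Least-squares fit of y on a single regressor x.
    Returns (constant, slope); the constant is 0.0 when origin=True."""
    n = len(y)
    if origin:
        # No constant term: minimize sum((y - b*x)**2) directly.
        slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
        return 0.0, slope
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]          # exactly y = 1 + 2x
print(ols_simple(y, x))            # constant supplied automatically
print(ols_simple(y, x, origin=True))
```

Because the data lie exactly on a line with intercept 1, forcing the fit through the origin changes the slope from 2 to 70/30, which is why the diagnostics must be adjusted when ORIGIN is used.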
9.2 Autocorrelation Techniques for the Single Equation Model
Two estimation techniques are available for estimating single equation
models when the user believes that the error terms are not independent, but
that a disturbance in one period influences later disturbances. The
Cochrane-Orcutt (CORC) iterative technique and the Hildreth-Lu (HILU) scan-
ning technique estimate models assuming first order serial autocorrelation
of the disturbances.
When either autocorrelation technique is invoked, SORITEC Sampler
temporarily shortens the USE period by one observation at the beginning
of the sample and by one observation after every gap to calculate the
required transformed data. The USE command in force, therefore, should
include the observations which are lost in the transformation of variables.
The USE period is then restored to its original interval(s) after the
command is completed. Regression diagnostics are calculated from the
residuals of the regression on the transformed variables.
9.2.1 Cochrane-Orcutt Iterative Technique
The Cochrane-Orcutt estimator is invoked by the command:
CORC [(ORIGIN)] dep_var ind_var_1 ind_var_2 ... ind_var_n
Command syntax considerations are identical to those associated with the
REGRESS command described in the previous section.
9.2.2 Hildreth-Lu Scanning Technique
In addition to the dependent and independent variable lists, the HILU
command requires that the lower and upper limits to the value of rho and
its stepsize during the scanning process be initialized. These values are
entered by the user into the command line by a set of positional parameters
that are optional. The syntax of the HILU command is:
HILU [([ORIGIN] ROMIN ROMAX ROSTEP)] dep_var &
ind_var_1 ind_var_2 ... ind_var_n
where the dependent and independent variable lists are positioned as in
the other regression commands. ROMIN is an optional positional parameter
that defines the lower limit of rho. Similarly, ROMAX specifies the
upper limit to rho. The stepsize of the scanning process is defined by the
third positional parameter, ROSTEP.
If omitted from the command line, these parameters assume default
values of 0.0, 1.0 and 0.1, respectively. The user can selectively
initialize these parameters by entering the wild card symbol * in positions
where default values are to be assumed and the desired numeric values in
the other positions. For example, the command:
HILU (* * .05) dep_var ind_var_1 ind_var_2 ... ind_var_n
initializes ROMIN and ROMAX to their default values of 0.0 and 1.0,
respectively, and sets ROSTEP to the user-selected value of 0.05. If
positional parameters are entered into the command line, they must be
enclosed within parentheses.
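The two autocorrelation techniques of Section 9.2 can be sketched in Python for a model with one regressor. This is a textbook illustration under stated assumptions and not SORITEC's implementation: the function names are hypothetical, the data are artificial, gaps in the USE period are ignored, and the scan skips rho = 1 because the constant term cannot be recovered there.

```python
def ols(y, x):
    """Simple regression with a constant; returns (constant, slope)."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def quasi_difference(z, rho):
    """z[t] - rho*z[t-1]; the first observation is lost."""
    return [z[t] - rho * z[t - 1] for t in range(1, len(z))]

def cochrane_orcutt(y, x, tol=1e-6, max_iter=50):
    """Alternate between estimating rho from the residuals and
    re-fitting the equation on quasi-differenced data."""
    const, slope = ols(y, x)
    rho = 0.0
    for iteration in range(1, max_iter + 1):
        e = [b - const - slope * a for a, b in zip(x, y)]
        new_rho = (sum(e[t] * e[t - 1] for t in range(1, len(e)))
                   / sum(v * v for v in e))
        c_star, slope = ols(quasi_difference(y, new_rho),
                            quasi_difference(x, new_rho))
        const = c_star / (1.0 - new_rho)
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return const, slope, rho, iteration

def hildreth_lu(y, x, romin=0.0, romax=1.0, rostep=0.1):
    """Scan rho over a grid and keep the value with the smallest SSR."""
    best = None
    steps = int(round((romax - romin) / rostep))
    for k in range(steps + 1):
        rho = romin + k * rostep
        if abs(1.0 - rho) < 1e-9:
            continue                      # constant not identified at rho = 1
        ys, xs = quasi_difference(y, rho), quasi_difference(x, rho)
        c_star, slope = ols(ys, xs)
        ssr = sum((b - c_star - slope * a) ** 2 for a, b in zip(xs, ys))
        if best is None or ssr < best[0]:
            best = (ssr, rho, c_star / (1.0 - rho), slope)
    return best

# Artificial data: y = 2 + 0.5x plus a deterministic AR(1)-style disturbance.
x = [float(t) for t in range(1, 21)]
u, y = 0.0, []
for t, xt in enumerate(x):
    u = 0.6 * u + ((7 * t) % 11 - 5) / 10.0
    y.append(2.0 + 0.5 * xt + u)

print(cochrane_orcutt(y, x))
print(hildreth_lu(y, x))                  # default grid: 0.0 to 1.0 by 0.1
print(hildreth_lu(y, x, rostep=0.05))     # finer scan, as in HILU (* * .05)
```

Both routines work on the transformed series, which is one observation shorter than the original, matching the temporary shortening of the USE period described in Section 9.2.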
9.3 Two-Stage Least Squares (2SLS) Estimates
Consistent estimates for a single equation from a simultaneous equa-
tion system can be obtained by using a two-stage least squares (2SLS)
estimator. Unlike the other estimation commands in this chapter, the 2SLS
procedure requires the user to enter two commands to estimate an equation.
First, all exogenous variables must be identified in an EXOGENOUS
statement, which has the form:
EXOGENOUS exog_var1 exog_var2 ... exog_varn
All arguments associated with this command are exogenous variable names.
The EXOGENOUS command must be specified before invoking the 2SLS estimator.
After execution, all later 2SLS commands use the same list of exogenous
variables until another EXOGENOUS command is entered.
Two-stage least squares estimation is invoked by the TWOSTAGE command
which has the form:
TWOSTAGE [(ORIGIN)] dep_var ind_var_1 ind_var_2 ... ind_var_n
All arguments plus the ORIGIN keyword in the command line have the same
interpretation as used in the REGRESS command. Two-stage least squares
commands that detect omitted or mis-specified exogenous variables generate
error messages until a valid EXOGENOUS command is executed.
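The two stages can be illustrated in Python for the simplest possible case: one right-hand endogenous variable and one exogenous variable serving as the instrument. This is a sketch with invented data and hypothetical function names, not SORITEC code, and it computes coefficients only; the standard errors of a correct 2SLS estimator are not those of the second-stage regression.

```python
def ols(y, x):
    """Simple regression with a constant; returns (constant, slope)."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def two_stage(y, x, z):
    """2SLS for y = a + b*x, where x is endogenous and z is exogenous."""
    g0, g1 = ols(x, z)                    # stage 1: regress x on z
    x_hat = [g0 + g1 * v for v in z]      # fitted values of x
    return ols(y, x_hat)                  # stage 2: regress y on fitted x

z = [1.0, 2.0, 3.0, 4.0, 5.0]             # exogenous variable
x = [2.0, 1.0, 4.0, 3.0, 6.0]             # endogenous right-hand variable
y = [1.0, 3.0, 2.0, 5.0, 4.0]             # dependent variable
print(two_stage(y, x, z))
print(ols(y, x))                          # ordinary least squares, for comparison
```

In this one-instrument case the 2SLS slope equals the covariance of z and y divided by the covariance of z and x, which gives a quick hand check of the result.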
9.4 Forecasting Single Equation Models
Any single-equation model that has been estimated by SORITEC Sampler
can be forecast using the fitted equation that is stored as a SORITEC
internal variable. To forecast an equation, all of the independent or
right-hand variables that were used to estimate it must be defined for the
period over which the forecast is to be made. These values may be
observed, projected, assumed or may be the product of other forecasts.
Although the forecast itself results from the execution of a single
command, a series of commands must be executed to generate meaningful
results:
(1) Estimate a single equation model using the REGRESS,
CORC, HILU or TWOSTAGE command.
(2) Change the active observation period to the forecast
period with the USE command.
(3) RECOVER the fitted equation from its internal system
name of FOREQ.
(4) Use the FORECAST command to forecast the fitted
equation over the desired time period.
The format of the FORECAST command is:
FORECAST fitted_equation_name
Since SORITEC internal system names may be referenced directly from
the FORECAST command, step (3) is optional. In this case, the fitted equa-
tion is forecast simply by entering:
FORECAST ^FOREQ
Use of the RECOVER command is necessary, however, if you want to FORECAST
the fitted equation after estimating other models since SORITEC replaces
^FOREQ each time an equation is estimated. Fitted equations can be
databanked like most other SORITEC items.
Forecasting single equation models in SORITEC Sampler is illustrated
in the example below.
USE 1975Q1 1982Q4
REGRESS gnp consumption investment(-1)
RECOVER gnp_equation FOREQ
USE 1983Q1 1984Q3
FORECAST gnp_equation
PRINT gnp
If the fitted equation is not needed after being forecast, the command
sequence is:
USE 1975Q1 1982Q4
REGRESS gnp consumption investment(-1)
USE 1983Q1 1984Q3
FORECAST ^FOREQ
PRINT gnp
The FORECAST command executes only a static forecast. This means that
lagged independent variables are not automatically generated for each
successive period but instead must be supplied during the forecast. In
other words, the command sequence:
USE 1980Q1 1984Q4
REGRESS gnp gnp(-1)
USE 1985Q1 1985Q4
FORECAST ^FOREQ
is illegal and generates an error if there are no data for "gnp" beyond
1985Q1.
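The meaning of a static forecast can be made concrete with a short Python sketch. It is not SORITEC code: the function name static_forecast is hypothetical, periods are represented by plain integers, and the coefficients and data are invented for the illustration.

```python
def static_forecast(const, slope, series, periods):
    """Forecast y[t] = const + slope*y[t-1] for each t in `periods`,
    reading y[t-1] from the data and never from an earlier forecast."""
    out = {}
    for t in periods:
        lag = series.get(t - 1)
        if lag is None:
            raise ValueError(f"no data for lagged variable at period {t - 1}")
        out[t] = const + slope * lag
    return out

observed = {0: 100.0, 1: 102.0, 2: 104.0}       # data end at period 2

print(static_forecast(1.0, 1.0, observed, [3]))  # needs only period 2

try:
    static_forecast(1.0, 1.0, observed, [3, 4])  # period 4 needs period 3
except ValueError as err:
    print("error:", err)
```

The second call fails for the same reason as the command sequence above: the lagged value for the later period is never supplied, and a static forecast does not feed its own earlier results back in.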
Note that the FORECAST command stores the forecasted values of the
dependent variable under the same name as the dependent variable previously
defined. This means that any existing values for the dependent variable
over the forecast period are replaced and cannot be retrieved. All
existing values for the dependent variable outside the forecast period are
retained, however, with the result that forecasted values are spliced into
the original series as though the REVISE command had been used. To
preserve existing values, the dependent variable series should first be
copied to another series name or databanked before forecasting the fitted
equation, e.g.,
USE 1975Q1 1982Q4
REGRESS gnp consumption investment(-1)
RECOVER gnp_equation FOREQ
USE 1983Q1 1984Q3
temp_gnp = gnp
FORECAST gnp_equation
PRINT gnp temp_gnp
As values for "temp_gnp" are MISSING prior to 1983Q1 (since the active USE
period was 1983Q1 to 1984Q3 when the transformation was executed), the
original series is recreated by the command sequence:
USE 1983Q1 1984Q3
REVISE gnp = temp_gnp
Alternatively, copy both estimation and forecast period observations to
temporary variables before forecasting an equation.
Chapter 10
SORITEC Interactive Print Server
10.0 Introduction
SORITEC Sampler allows complete control over the output presentation
for selected procedures. In REGRESS and XTAB the user controls the
order and depth of the presentation of the results. REGRESS generates 10
separate output summaries which may be selected, or repeated, in any order
that you desire. XTAB allows you to scroll through the crosstabs
table in a "spreadsheet" mode, or switch to the table of summary
statistics. In addition, a HELP menu is provided which describes each
display option.
The interactive regression display supports 10 different screen dis-
plays including 3 tables of residual summaries, a residual plot, the
covariance matrix of coefficients, the correlation matrix of coefficients,
extended regression reports (beta coefficient, partial r and elasticities),
a regression summary table, the ANOVA table for goodness of fit, means and
standard deviations of the independent variables and of course the regres-
sion estimates.
When the interactive mode is in effect, a selection menu appears on
the last line of the screen. Entering a ? will bring up a more detailed
help menu regarding the contents of each display. Selecting an invalid
choice sounds the "bell" and prompts you for another choice. There are
several additional special keystrokes, in addition to those in the selec-
tion menu, that control interactive display. Entering a carriage return, a
'+' or a space advances the display to the next tableau in the selection
menu. Entering a backspace returns you to the previously displayed tab-
leau. Entering a '-' displays the previous screen in the selection menu.
The interactive option is available for REGRESS, TWOSTAGE, CORC, and HILU.
10.1 Entering Interactive Mode
To enable the interactive mode you must turn on the option by entering
the command:
ON CRT
When this option is enabled, SORITEC Sampler automatically switches into an
interactive presentation whenever a command is executed that supports the
interactive tableaus.
To stop the interactive presentation, enter OFF CRT. SORITEC Sampler
will resume normal output presentation.
10.2 Tableau Descriptions
The following sections discuss each tableau and its associated menu
selection code, available with SORITEC estimation commands.
10.2.1 Coefficient Display (E)
Coefficient estimates are automatically displayed when the regression
equation is estimated. The presentation shows the technique, the current
sample period, coefficients, standard errors, t-values and the significance
levels of the t statistic.
10.2.2 Regression Summary Table (G)
The regression summary table provides a quick synopsis of the regres-
sion. The table reports the number of observations, mean of the dependent
variable, the log-likelihood ratio, Schwarz and Akaike criteria, R-squared
(adjusted), the standard error of the regression, Durbin-Watson and F-
statistics and the significance of the F-statistic. If the ORIGIN option
is specified, the statistics are adjusted appropriately.
10.2.3 Residual Autocorrelation Summary (R)
The residual summary table provides information on the distribution of
the residuals (mean, variance, skewness, kurtosis, minimum, maximum,
average absolute error, etc.) and the autocorrelation structure of the
residuals with Durbin-Watson (for one, four and 12 periods) and the first
24 Box-Pierce statistics. All these statistics, along with the first 24
autocorrelation coefficients, may be recovered for later analysis.
10.2.4 PDF and Histogram of Standardized Residuals (H)
This table provides a quick summary of the distribution of the resi-
duals for quick identification of outliers or a skewed distribution, and
shows the percentage of residuals falling between each integer multiple of
the regression error variance, including a histogram of the same infor-
mation. The histogram information has a higher resolution than the table
since each line of the screen represents 1/3 of a standard deviation.
Because of this, scale may at times appear to be off somewhat; specifical-
ly, if the maximum table value is 40% the maximum vertical value on the
plot might be, say, 17%.
10.2.5 Non-Parametric Residual Distribution Tests (N)
This table provides a set of statistical tests on the normality of the
residual distribution as well as tests of the randomness of the residuals.
Specifically, SORITEC Sampler carries out a "Run of Signs" test for random-
ness, a chi-square test against the normal distribution, and a Kolmogorov
test for normality.
10.2.6 Regression ANOVA Table (A)
This is the standard ANOVA table showing the derivation of the F-
statistic reported in the summary table. Similar to the summary table, all
reported statistics are adjusted appropriately when the regression equation
is constrained through the origin. ON ANOVA will activate this output when
the OFF CRT flag, or non-interactive mode, is set.
10.2.7 Covariance Matrix of Coefficient Estimates (V)
This tableau displays a variance-covariance matrix of the coeffi-
cients. It is equivalent to the display produced by the ON VCOV option
when the OFF CRT option is set.
10.2.8 Correlation Matrix of Coefficient Estimates (C)
Although there is little theory regarding the correlation matrix of
coefficient estimates, it does provide a quick way to examine the
relationship between pairs of coefficients. ON CCOR will present this
display when SORITEC Sampler is in OFF CRT mode.
10.2.9 Beta Coefficients, Elasticities and Partial R (B)
This tableau presents coefficient estimates and their associated Beta
coefficients, elasticities and partial correlation coefficients. ON BETA
enables this display when the OFF CRT option is set.
10.2.10 Statistical Summary of Exogenous Variables (S)
This table reports the mean and standard deviation of the independent
variables. When the OFF CRT option is set, this display is activated by ON
STATS.
10.2.11 Actual vs Fitted Plot and Standardized Residuals (P)
This display shows the actual versus fitted and standardized residuals
for the regression. The plot is produced in a form that is reproducible by
line printers unless your PC has an IBM color graphics compatible display.
In that case, the plots appear in 3-color medium resolution mode. ON PLOT
activates this output when the OFF CRT option is set.
10.3 Interactive Crosstabs
The XTAB command allows for interactive scrolling through the table in
a spreadsheet manner along with the option to present the summary statis-
tics for the current table. In this mode, keys are interpreted as follows:
(X) move down one screen, (S) move left one screen, (D) move right one
screen, (E) move up one screen, (T) to view the summary table of test
statistics, and (Q) to quit the crosstabs.
APPENDIX I
SORITEC INTERNAL SYSTEM NAMES
--------------------------------------------------------------------------
INTERNAL  TYPE          PRODUCED
SYSTEM    OF            BY
NAME      ITEM          COMMANDS*         LENGTH    DESCRIPTION
--------------------------------------------------------------------------
CCOR      MATRIX        (5)               NV**2     CORRELATION MATRIX OF
                                                    COEFFICIENTS
COEF      VECTOR        (5)               NV        REGRESSION COEFFICIENTS
COR       MATRIX        CORREL            NARGS**2  CORRELATION MATRIX
COV       MATRIX        COVAR, CORREL     NARGS**2  COVARIANCE MATRIX
DEP       ALPHANUMERIC  (2),(3),          1         NAME OF DEPENDENT
          ITEMS         ALMON,REGRESS,              VARIABLE
                        TWOSTAGE
DEVS      VECTOR        STATS,CORREL      NARGS     STANDARD DEVIATIONS OF
                                                    VARIABLES
DW        CONSTANT      (5)                         DURBIN-WATSON STATISTIC
FACTOR    VARIABLE      ADJUST            NOBS      SEASONAL FACTOR SERIES
FOREQ     EQUATION      REGRESS,          N/A       FITTED EQUATION FOR
                        TWOSTAGE                    FORECASTING
GAPS      CONSTANT      USE                         NUMBER OF GAPS IN
                                                    CURRENT USE COMMAND
ITERS     CONSTANT      (2),(3),(4)                 ITERATIONS USED IN
                                                    ARRIVING AT COEFFICIENTS
LAGCOi    VECTOR        ALMON             NDEGi+1   LAG COEFFICIENTS ON iTH
                                                    DISTRIBUTED LAG VARIABLE
LAGSEi    VECTOR        ALMON             NDEGi+1   STANDARD ERRORS OF LAG
                                                    COEFFICIENTS LAGCO(i)
LAGSUMi   CONSTANT      ALMON                       SUM OF LAG COEFFICIENTS
                                                    FOR iTH DISTRIBUTED LAG
                                                    VARIABLE
MEANS     VECTOR        STATS,CORREL      NARGS     MEANS OF VARIABLES
MLAGi     CONSTANT      ALMON                       MEAN LAG FOR iTH
                                                    DISTRIBUTED LAG VARIABLE
NARGS     CONSTANT      COVAR,CORREL,               NUMBER OF VARIABLES IN
                        STATS                       ARGUMENT LIST
NDEGi     CONSTANT      ALMON                       DEGREE OF iTH
                                                    DISTRIBUTED LAG VARIABLE
NEQ       CONSTANT      (4)                         NUMBER OF EQUATIONS
                                                    ESTIMATED
NGAPS     CONSTANT      (5)                         NUMBER OF GAPS IN USE
                                                    USED FOR LAST REGRESSION
NOBS      CONSTANT      (5)                         NUMBER OF OBSERVATIONS
                                                    USED IN LAST REGRESSION
NV        CONSTANT      REGRESS,(2),(3),            NUMBER OF INDEPENDENT
                        TWOSTAGE                    RIGHT-HAND VARIABLES IN
                                                    LAST REGRESSION
NV        CONSTANT      (4), ALMON                  NUMBER OF COEFFICIENTS
                                                    OR PARAMETERS ESTIMATED
                                                    BY LAST (4) OR ALMON
                                                    COMMAND
OBS       CONSTANT      USE                         NUMBER OF OBSERVATIONS
                                                    IN CURRENT USE
RAWEQ     EQUATION      REGRESS,          N/A       USER'S ORIGINAL
                        TWOSTAGE                    UNFITTED EQUATION
REGSE     CONSTANT      (2),(3), ALMON              STANDARD ERROR OF
                        REGRESS,TWOSTAGE            REGRESSION
RHO       CONSTANT      (2)                         1ST-ORDER
                                                    AUTO-CORRELATION
                                                    COEFFICIENT
RHO       VECTOR        (3)               2         1ST-ORDER AND 2ND-ORDER
                                                    AUTO-CORRELATION
                                                    COEFFICIENTS
RSQ       CONSTANT      (5)                         R-SQUARED
RSQADJ    CONSTANT      (5)                         R-SQUARED ADJUSTED FOR
                                                    DEGREES OF FREEDOM
SE        VECTOR        (5)               NV        COEFFICIENT STANDARD
                                                    ERRORS
SSR       CONSTANT      ALMON,REGRESS               SUM OF SQUARED
                        TWOSTAGE,(2),(3)            RESIDUALS
VCOV      MATRIX        (5)               NV**2     VARIANCE-COVARIANCE
                                                    MATRIX OF COEFFICIENTS
YFIT      VARIABLE      (5)               NOBS      FITTED VALUES
YMEAN     CONSTANT      ALMON,REGRESS               MEAN OF DEPENDENT
                        TWOSTAGE,(2),(3)            VARIABLE
--------------------------------------------------------------------------
*INTERNAL RESULTS ARE PRODUCED BY THE COMMANDS ASSOCIATED WITH THE FOLLOWING
NUMBERS:
(1) REGRESS, TWOSTAGE, MVR, THREESTAGE
(2) HILU, TSHILU, CORC, TSCORC
(3) HILU2, TSHILU2, CORC2, TSCORC2
(4) MVR, THREESTAGE, nonlinear REGRESS, nonlinear TWOSTAGE
(5) ALMON, (1), (2), (3)
NOTE: Not all commands are available in SORITEC Sampler.
APPENDIX II
GLOBAL OPTIONS AND DEFAULT SETTINGS IN SORITEC
----------------------------------------------
DEFAULT
OPTION SETTING DESCRIPTION
------ ------- -----------
ALIAS OFF The ALIAS option controls the printing of
variable names in output produced by SORITEC
commands invoked from a PROCEDURE. It is not
supported in SORITEC Sampler.
ANOVA OFF When the OFF CRT option is in effect, ON
ANOVA generates a standard ANOVA table
with SORITEC estimation results showing the
derivation of the F-statistic reported in
the summary table. It is otherwise generated
by the A-key in interactive mode.
BETA OFF When the OFF CRT option is in effect, ON BETA
generates the regression tableau that pre-
sents coefficient estimates and their
associated Beta coefficients, elasticities
and partial correlation coefficients. This
tableau is also generated by the B-key in
interactive mode.
BRIEF OFF Suppresses command number prompts in interac-
tive mode, as well as messages reminding the
user to close DO loops and procedures, and to
satisfy outstanding GO TO's.
CCOR OFF Correlation matrix of regression coefficients
is printed after every regression.
CRT OFF The CRT option is used with the PAGESIZE
command to control SORITEC output to the CRT
terminal. When the CRT option is ON,
SORITEC prints only PAGESIZE or fewer lines
of information before pausing. Entering a
carriage return resumes output. ON CRT also
enables the tableaus associated with
SORITEC's estimation and XTAB commands.
DETAIL OFF Not implemented at this release.
DIVZERO ON Not implemented at this release.
DOLLAR OFF When the DOLLAR flag is turned ON,
dollar signs in SORITEC input are inter-
preted as semicolons (statement separators).
Use of this feature is not recommended and
the flag will be removed in a future release.
DYNAMIC OFF Causes transformations involving lagged
variables to be performed dynamically instead
of statically.
ECHO OFF Echoes input lines to output device.
GROUP OFF Enables automatic group expansion in
commands.
HEAD ON Prints standard headings on each page (batch
runs only).
JOURNAL OFF The JOURNAL flag controls writing of inte-
ractive input to the journal file. It is
set OFF when SORITEC begins execution and is
set ON when interactive processing mode is
invoked by the HELLO command.
LOG OFF Not implemented at this release.
MISSING ON Causes warning messages to print where the
user accesses observations which never have
been given a value.
NEGEXP OFF Not implemented at this release.
NEGLOG ON Not implemented at this release.
NOEJECT OFF Not implemented at this release.
NOERROR OFF Not implemented at this release.
NOMATS ON Saves workspace by suppressing storage of the
VCOV, CCOR, and RAWEQ internal results after
each regression.
PERFECT OFF Not implemented at this release.
PLOT OFF Plots actual versus fitted values of the
dependent variable after every regression.
The plot is generated in a form reproducible
by line printers unless your PC has an IBM
color graphics compatible display, in which
case it appears in 3-color medium resolution
mode.
PRINT OFF Controls printing of intermediate
computational results.
PROMPT OFF Not implemented at this release.
RAGGED OFF When enabled, the RAGGED option allows you to
assign fewer observations to a variable using
the FILL command than are associated with the
current USE period. Usually, an error
message is generated when this condition
exists. FILL assigns MISSING values to
observations beyond the end of shorter series
to the end of the USE period. ON RAGGED does
NOT permit the entry of more observations
than specified in the current USE period.
RAWEQ ON The RAWEQ option, when enabled, stores the
raw equation associated with any regression
estimated by SORITEC under the internal
variable name ^RAWEQ. Disabling the option
saves symbol table space, since several coef-
ficients are stored for each RAWEQ entry.
REPLACE OFF When REPLACE is turned ON, the databanking
KEEP command saves items on the currently
ACCESSed databank regardless of whether name
conflicts occur with items already stored
in the databank. In other words, KEEP acts
like a REPLACE command when this option is
enabled.
RESIDUAL OFF When the OFF CRT option is in effect, the
RESIDUAL global option generates three of the
regression tableaus available in CRT mode.
These are:
(1) the Residual Summary Table that provides
information on the distribution of the resi-
duals (mean, variance, skewness, kurtosis,
minimum, maximum, average absolute error,
etc.) and the autocorrelation structure of
the residuals with Durbin-Watson (for one,
four and 12 periods) and the first 24 Box-
Pierce statistics.
(2) PDF and Histogram of Standardized
Residuals, providing a quick summary of the
distribution of the residuals for quick
identification of outliers or a skewed
distribution. It also shows the percentage of
residuals falling between each integer multiple
of the regression error variance, including a
histogram of the same information.
(3) Non-Parametric Residual Distribution
Tests, providing a set of statistical tests
of the normality of the residual distribution
as well as tests of the randomness of the
residuals.
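The autocorrelation statistics named in the Residual Summary Table can be sketched in Python using their textbook definitions. The manual does not state the exact formulas SORITEC uses, so treat the function names and definitions below as assumptions rather than a description of the program.

```python
def durbin_watson(resid, lag=1):
    """Durbin-Watson-type statistic at the given lag (textbook form).

    Sum of squared lag differences divided by the sum of squared residuals.
    """
    num = sum((resid[t] - resid[t - lag]) ** 2 for t in range(lag, len(resid)))
    den = sum(e * e for e in resid)
    return num / den


def autocorrelation(resid, k):
    """Sample autocorrelation of the residuals at lag k."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [e - mean for e in resid]
    denom = sum(d * d for d in dev)
    return sum(dev[t] * dev[t - k] for t in range(k, n)) / denom


def box_pierce(resid, max_lag):
    """Box-Pierce Q statistics for lags 1..max_lag.

    Q(m) = n * (r_1**2 + ... + r_m**2), returned as a list of length max_lag.
    """
    n = len(resid)
    q, total = [], 0.0
    for k in range(1, max_lag + 1):
        total += autocorrelation(resid, k) ** 2
        q.append(n * total)
    return q
```

Under these definitions, the one-, four- and 12-period statistics would correspond to durbin_watson(resid, 1), durbin_watson(resid, 4) and durbin_watson(resid, 12), and the first 24 Box-Pierce statistics to box_pierce(resid, 24).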
REVISE OFF Enables automatic splicing and updating of
time-series. With REVISE set ON, all
assignment and FILL statements behave as
though they are prefixed by a REVISE command.
This means that observations are added to
existing series if the current USE period is
outside the range of the USE period under
which the data series was last defined. If
the current USE period is a subset of the USE
period under which the symbol was last de-
fined, no truncation of the series occurs.
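The splicing behaviour described for REVISE can be sketched in Python. The representation of a series as a dictionary keyed by period, the function name assign, and the assumption that a plain assignment redefines the series over the current USE period only are all illustrative; they are not taken from SORITEC itself.

```python
def assign(existing, new_obs, revise=False):
    """Combine an existing series with observations for the current USE period.

    existing and new_obs are dicts mapping a period index to a value;
    new_obs holds the observations covered by the current USE period.
    """
    if not revise:
        # Assumed plain behaviour: the series is redefined over the
        # current USE period only.
        return dict(new_obs)
    merged = dict(existing)  # keep observations outside the USE period
    merged.update(new_obs)   # add or overwrite observations inside it
    return merged
```

With revise=True, a USE period that extends past the old range lengthens the series, and a USE period that is a subset of the old range updates only those observations without truncating the rest.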
SMPL OFF Not implemented at this release.
STATS OFF When enabled, prints the mean and standard
deviation of all independent variables in a
regression.
STREAMIO OFF When enabled, this option allows formatted
READ commands to read successive observations
of a variable along a row, rather than down a
column, as normally expected.
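The difference between the two layouts can be sketched in Python for a single variable. The manual does not describe how several variables are arranged under STREAMIO, so this example covers one variable only; the function names and the whitespace-separated input are assumptions.

```python
def parse_rows(text):
    """Split free-form text into rows of floating-point numbers."""
    return [[float(x) for x in line.split()]
            for line in text.splitlines() if line.strip()]


def read_column(rows):
    """Default layout: one observation per row, read down the first column."""
    return [row[0] for row in rows]


def read_stream(rows):
    """STREAMIO-style layout: observations run along each row in turn."""
    return [value for row in rows for value in row]
```

Given the two rows "1 2 3" and "4 5 6", read_column yields 1 and 4, while read_stream yields 1 through 6 in order.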
TRAIL OFF When enabled, the TRAIL option generates a
debug trail for diagnosing SORITEC bugs.
UPRINT ON UPRINT controls the printing of underscores
(_) in variable names. When enabled, SORITEC
prints the underscores.
VCOV OFF Variance-covariance matrix of regression
coefficients is printed after every
regression.
APPENDIX III
QUICK REFERENCE LISTING OF SORITEC Sampler COMMANDS
ACCESS filename
ACCESS 'd:filename'
ACCESS '\directory1\directory2\filename'
AMORT([PERIOD=<D,W,T,M,Q,S,A>,<SIMPLE,COMPOUND>], &
[RULEOF78],[BALLOON=#]) &
payment loan interest_rate [aux_pay_1 ... aux_pay_n]
COMPUTE equation_name
[COMPUTE] transformation_expression
CONSTANT const_1 [value_1] const_2 [value_2] ...
CONTENTS [filename]
CONTENTS 'd:filename'
CONTENTS '\directory1\directory2\file
Statement_number CONTINUE
CONVERT [(modifier)] input_series
CONVERT [(modifier)] output_series = input_series
COPY item_1 item_2 ... item_n
CORC [(ORIGIN)] dep_var ind_var_1 ind_var_2 ... ind_var_n
CORREL series_1 series_2 series_3 ...
COVA series_1 series_2 series_3 ...
CREATE filename
CREATE 'd:filename'
CREATE '\directory1\directory2\filename'
DISCARD item_1 item_2 ... item_n
DO index = beginning_value TO end_value BY increment
END
DOT variable_1 variable_2 ... variable_n
ENDDOT
DUMMY output_series first_observation skip_increment
END
ENDDOT
EQUATION equation_name [equation]
EXECUTE filename
EXECUTE 'd:filename'
EXECUTE '\path\filename'
EXOGENOUS exog_var_1 exog_var_2 ... exog_var_n
FILL variable_name value_list
FLAGS flag_vector
FORECAST fitted_equation_name
FORECAST ^FOREQ
Statement_number FORMAT format_specification
*FORGET [item_name]
-----------------------------
* denotes commands that accept wildcard characters in arguments.
GO TO statement_number (also GOTO)
*GROUP group_name name_1 name_2 ... name_n
HELLO
HILU [([ORIGIN] ROMIN ROMAX ROSTEP)] dep_var ind_var_1 &
ind_var_2 ... ind_var_n
IF condition; THEN; command_sequence_1; ELSE; command_sequence_2
IMPUTE [ZERO|MEAN|INTER|TREND|NONE]
IRR([CAPITAL=#,ITER=#,TOL=#,INITIALR=#]) &
interest_rate net_income_series
IRR([CAPITAL=#,ITER=#,TOL=#,INITIALR=#]) interest_rate benefits costs
JOB job_label
KEEP item_1 item_2 ... item_n
KEEP(ACTIVE) item_1 item_2 ... item_n
MA output_series input_series length
MAX maximum_value input_series
MAX output_series input_series_1 input_series_2 ...
MAXERR number
MEAN mean input_series
MIN minimum_value input_series
MIN output_series input_series_1 input_series_2 ...
MISSING constant_name
MOD remainder dividend divisor
MSUM output_series input_series length
OFFLIST
ONLIST
PARAMETER param_1 [value_1] param_2 [value_2] ...
PLOT series_1 symbol_1 series_2 symbol_2 ...
PRINT arg_1 arg_2 arg_3 ...
PUNCH series_1 series_2 ...
PUNCHDIF[(filename)] arg_1 arg_2 arg_3 ...
PUNCHDIF('[d:][\path\]filename') arg_1 arg_2 arg_3 ...
PURGE filename
PURGE '[d:][\path\]filename'
PV([PERIOD=<D,W,T,M,Q,S,A>,<SIMPLE,COMPOUND>]) ...
present_value net_income_stream <costs> interest_rate
QUIT
READ(filename)
READ('[d:][\path\]filename')
READ([filename] [statement_number]) series_1 series_2 ...
READ(['[d:][\path\]filename'] [statement_number]) &
series_1 series_2 ...
READDIF(filename)
READDIF('[d:][\path\]filename')
READDIF(filename) series_1 series_2 ...
READDIF([filename] statement_number) series_1 series_2 ...
RECODE output_series input_series p(1) p(2) p(3) p(4) ...
RECOVER [new_name] internal_name
REGRESS [(ORIGIN)] dep_var ind_var_1 ind_var_2 ... ind_var_n
RENAME new_name_1 old_name_1 new_name_2 old_name_2 ...
REPLACE item_1 item_2 ... item_n
RETURN
REVISE transformation_expression
RMS root_mean_square input_series
SCAN number
SCATTER series_1 series_2
SSR sum_squared_resids input_series
SUM sum input_series
SWITCH item_1 item_2
SYNOPSIS var_1 var_2 ... var_n
*SYMBOLS [ALL]
TIME [series_name]
TITLE [label]
TWOSTAGE [(ORIGIN)] dep_var ind_var_1 ind_var_2 ... ind_var_n
USE [begin_1] [end_1] [begin_2] [end_2] ...
USEIF expression
VECTOR vector_name value_1 value_2 ...
WIDTH number
WRITE([filename] [statement_number]) var_1 var_2 ...
WRITE(['[d:][\path\][filename]'] [statement_number]) &
var_1 var_2 ...
WRITE([filename] [statement_number]) constant_1 (time_series_1 &
time_series_2) constant_2
WRITE(['[d:][\path\][filename]'] [statement_number]) &
constant_1 (time_series_1 &
time_series_2) constant_2
XTAB series_1 series_2
APPENDIX IV
DETAILED FEATURE LIST FOR SORITEC VERSION 1.06B
1. REGRESSION TECHNIQUES

Ordinary Least Squares Regression
  Linear
  Non-linear #
  First-order Cochrane-Orcutt or Hildreth-Lu
  Second order C-O or H-L #
  Fast regression using the Cholesky Decomposition
  ARMA residuals #
  GLS autocorrelation estimation *

Stepwise Regression *
  Forward or backward methods
  CP statistics
  Multiple levels for the inclusion of variables

Exponential Smoothing Techniques
  Single exponential, Brown's linear & quadratic, Holt's
    linear, adaptive response, Winter's linear and seasonal

Linear trend, S-curve and exponential growth forecasting

Two-Stage Least Squares
  Linear
  Non-linear #
  First order C-O or H-L
  Second order C-O or H-L #
  Fast two-stage using the Cholesky Decomposition

Distributed Lag Models
  Almon
  Shiller
  Ability to recover and forecast with the unscrambled equation
  Almon with C-O or H-L

Advanced Single Equation Techniques
  Ridge regression #
    With arbitrary diagonal matrix or canonical scaling #
  Generalized least squares #
  GLS with C-O #
  Restricted least squares #
  Theil-Goldberger mixed estimation #
  Principal components analysis
  Minimax parameter estimation

Probit analysis #
Discriminant analysis * #

F Test of linear hypothesis
F Test of non-linear hypothesis #
Calculate confidence intervals for non-linear functions of
  coefficients #

Regression Diagnostics
  Standard errors and t-values
  Sum of residuals
  Sum of squared residuals
  Mean absolute residual
  Significance of t values
  Beta coefficients
  Partial R values
  F statistic and significance
  Residual analysis
    Durbin-Watson 1st, 4th and 12th order
    Skewness and kurtosis
    First 24 auto-correlation coefficients and Box-Pierce
      Q statistics
  ANOVA table for regression
  Elasticities of the coefficients
  Distribution tests on Residuals
    Percentage distribution of residuals between -3 and +3
      standard deviations

Procedures allow for regression through the origin and adjust
  the test statistics appropriately
Statistics adjusted correctly for gaps in sample period
Significance levels for all test statistics

Interactive, table-oriented output display for easy review of
  regression results

2. SYSTEMS ESTIMATION TECHNIQUES

Zellner's Seemingly Unrelated Regression
  Linear and non-linear #
  Iterative refinement of residual correlations (IRRC)
    optional #
Three-Stage Least Squares
  Linear and non-linear #
  With IRRC #
Full Information Maximum Likelihood
  Linear and non-linear #
  User selection of optimization method, stepsize algorithm,
    and convergence criteria #

Box-Jenkins Analysis
  Autocovariance
  Autocorrelation
  Partial autocorrelation and confidence intervals
  ARMA (p,q), and ARIMA (p,d,q)
  ARIMA with seasonal differencing
  Multivariate distributed lags with ARMA errors #
  Multivariate transfer functions
    Common rational coefficients models #
    Linearized form models #
    Gaps allowed in lag structure #
    Selection of holdout or backcasting #
    Arbitrary initial errors allowed #
    Multiplicative form models #

3. FORECASTING AND SIMULATION

Single Equation Forecasting
  Static forecast
  Dynamic forecasting
  Residual feedback
  Non-linear forecasts

Multiple Equation Forecasting
  Static simulation #
  Dynamic simulation #
  Non-linear equations allowed #
  Conditional expressions in equations allowed #
  Simultaneous equation capability #
Solution of simultaneous non-linear equations #
Automatic block-decomposition of simultaneous models #
Successive over- and under-relaxation user-selectable #
Easy comparison of scenarios #
User control of convergence criteria and values #

4. FINANCIAL AND ECONOMIC MEASURES

Present value
Internal rate of return

Depreciation
  Straight line, double-declining balance, sum of years
    digits, ACRS and ADR schedules
Loan amortization
Peak to peak interpolation
Capital stock accumulation #
Capital utilization #
Net capital investment #
Capital stock calculation #
Calculation of economic capacity #
Calculation of price indices #

5. CROSS-SECTIONAL AND SURVEY TECHNIQUES

Casewise deletion of missing values
Frequency distributions
Histograms
Synopsis command
T-Tests of grouped or paired data
Analysis of Variance
  ONEWAY and TWOWAY
  Any combination of fixed or random factors
  Covariates allowed
  Unequal number of observations allowed
  Diagnostic testing included
  Automatic determination of the appropriate analysis, i.e.,
    1- or 2-way, with/without interaction terms
  Frequency and histogram options
  Replications supported

Crosstabulation tables
  Nesting for multi-dimensional tables
  Full set of test statistics
  Interactive "spreadsheet" mode for reviewing output

Breakdown Analysis
  Nested breakdowns
  Histograms
  ANOVA testing

Non-Parametric Statistics
  Wilcoxon W+, signed rank test, run of signs test,
    Mann-Whitney U test, Spearman correlation, Kendall tau

Rank function for construction of other non-parametric tests,
  e.g., non-parametric ANOVA, etc.

Recovery of all intermediate results for cross-sectional
  procedures
Most procedures support dynamic recoding of continuous data to
  discrete categories
Most procedures support selection of a subset of discrete
  values for an analysis

6. TIME SERIES UTILITIES AND OPERATIONS

Weighted/moving averages and sums
Time series filter

RECODE function to convert continuous data ranges into
  discrete indicators

Convert periodicities between annual, monthly, quarterly,
  weekly, daily and undated data types (* for some
  combinations)

Subscript ranges allowed in leads and lags, e.g., X(-1 TO -6)
  expands to X(-1) X(-2) ... X(-6) throughout the command
  syntax

Basic Statistics
  Mean, standard deviation, mode, median, variance, skewness,
    kurtosis, range, deciles, quartiles, coefficient of
    variation, root mean square, correlation, covariance
    analysis, Z scores, minimum, maximum, casewise deletion of
    missing values

Normalization of time series
Seasonal dummy creation

Splicing function to merge two versions of the series into one
  continuous series; including simple splice, sliding weights,
  or regression with sliding weights *

Analysis of Goodness of Fit
  Runs test, chi squared normality tests, Box-Pierce Q
    statistics, frequency distribution of residuals,
    Durbin-Watson 1st, 4th and 12th order

DIF transformation to apply the nth difference operator to a
  series k times. #

Random Number Generators
  Beta, chi-squared, exponential, double exponential, F,
    geometric, normal, Poisson, t, uniform
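The subscript-range expansion listed above, in which X(-1 TO -6) stands for X(-1) X(-2) ... X(-6), can be sketched in Python. The function name and the way leads are written are assumptions made for the example; only the lag case is taken from the feature list.

```python
def expand_subscript_range(name, start, stop):
    """Expand name(start TO stop) into a list of single-subscript references.

    Counts downward when stop is below start (lags) and upward otherwise.
    """
    step = 1 if stop >= start else -1
    return ["%s(%d)" % (name, k) for k in range(start, stop + step, step)]
```

Calling expand_subscript_range("X", -1, -6) returns the six references X(-1) through X(-6), in that order.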
Cumulative Density Functions
  Normal, t, F, beta *, gamma *, chi-squared *, run of signs
    CDF

Seasonal Adjustment Techniques
  Ratio to moving average for monthly, quarterly, or
    arbitrary periodicity
  Census X-11 * #

7. MATHEMATICAL FUNCTIONS AND OPERATIONS

Algebraic entry of transformations
Logical operators supported
Modular arithmetic function
Sine, cosine, tangent, arc sine, arc cosine, arc tangent, log,
  log10, sinh, cosh, tanh, arc sinh, arc cosh, and arc tanh
  functions
Ceiling, floor, round, sign, abs, random and inverse normal
  PDF functions available

Substitution of missing values using zero, mean,
  interpolation or linear trend forecast values

Missing values propagate as missing in all math operations;
  0*MISSING propagates as 0
Logical operators can be specified in algebraic form, e.g.,
  >, >=, <, <=, etc.
Mixed logical and arithmetic operators allowed in expressions

TSP-like matrix commands
  Add or subtract two matrices, transpose a matrix, matrix
    orthogonalization, triangular matrix inversion, matrix
    factorization, move vector to a diagonal matrix, or
    extract diagonal elements to a vector

Full algebraic matrix mathematics, e.g.,
  B=INV(TR(X)*X)*TR(X)*Y, allows easy construction of complex
  estimators #

8. DATABANKING CAPABILITIES

Maximum number of items in a database limited only by disk
  space.
Sorted contents listing
Databank can store data series, equations, vectors, matrices,
  and linked models
Simple one word database commands to create, access, update,
  copy, rename, switch, replace, list or discard database
  items.
Database usage identical across mainframe, minicomputer and
  microcomputer versions #

9. PROGRAMMING LANGUAGE

Structured Programming Language Features
  User-defined procedures, labeled/numbered statements,
    global variables, local variables, recursion allowed,
    GOTO, IF/THEN/ELSE, DO loops, DOT loops (over alpha
    index), subscripted references allowed, external command
    files allowed

Equations and transformations specified in algebraic form
Wildcards allowed in most commands
Variable subscript references, e.g., X(K) (except in
  equations)

Lags can be specified as negative subscripts, e.g. X(-1) is
  the first lag

Access to intermediate and final results using a keyword
  RECOVER command, or by item name e.g.:
    RECOVER YFIT,
    RESID = Y-^YFIT

Namelist capability using GROUP command

Subscripted references to namelist elements allowed, e.g., if
  GROUP GRP1 contains X1 X2 X3 X4, then GRP1(3) is X3
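The missing-value rule listed under MATHEMATICAL FUNCTIONS AND OPERATIONS above (missing values propagate through arithmetic, except that 0*MISSING is 0) can be sketched in Python. The MISSING stand-in and the function names are assumptions, as is the choice to treat MISSING*0 the same way as 0*MISSING.

```python
MISSING = None  # hypothetical stand-in for SORITEC's MISSING value


def multiply(a, b):
    """Product in which an exact zero dominates a missing operand."""
    if a == 0 or b == 0:
        return 0  # 0*MISSING propagates as 0 (assumed symmetric)
    if a is MISSING or b is MISSING:
        return MISSING
    return a * b


def add(a, b):
    """Sum in which any missing operand makes the result missing."""
    if a is MISSING or b is MISSING:
        return MISSING
    return a + b
```

Here multiply(0, MISSING) gives 0 and multiply(2, MISSING) gives MISSING, while add(0, MISSING) is MISSING because only multiplication by zero is exempt from propagation.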
LEGAL function allows the user to test for missing values and
  develop custom missing value handling routines, e.g.,
  casewise, mean substitution, etc.

10. DATA ENTRY

Free-field data or FORTRAN formatted entry from disk or
  keyboard

DIF file I/O capabilities
TROLL print format input *
DBase II I/O supported *
Can be interfaced with mainframe databases, e.g., Citibase,
  Predicasts, IMF, OECD, etc.

Custom database interfaces and conversions to IBM PC/XT format
  available on a contract basis

Commercial databases available on diskettes for the PC and
  other non-mainframes (e.g., Citibase, etc.)

Data can be downloaded in SORITEC Alternate Load (.SAL) file
  format from major data vendors (DRI, WEFA, CITIBASE
  Connection)

11. REPORT-WRITING CAPABILITIES

Simplified report layout with complete user control of format,
  titles, contents, footnotes, labels and currency symbols #

Automatic row/column subtotals, grand totals, averages,
  products, differences, ratios and percentages #

Automatic footnoting #
Store and recall report formats #

Complex reports generated by a single command #

Specification of asterisks or blanks for small or missing
  values #

12. GRAPHICS

Printer graphics and plots

Medium resolution screen-oriented graphics
DIF I/O bridge to presentation-quality graphics programs

13. GENERAL FEATURES

Batch and interactive modes available
Item names may be thirty-two characters long

Equations may be recovered and printed

Full function command line editor allows the user to edit and
  rerun one or more previous commands

User access to differentiation routine
Input and Output journaling
SORT command
Global control over plots, statistics, etc.

14. PC VERSION SPECIFICS

User may exit to the operating system, run other programs and
  return to SORITEC session without losing any work

DOS commands can be executed inside SORITEC, allowing editors,
  communications programs, etc. to be used in SORITEC
  procedures

Supports DOS redirection and use of fully qualified file names
  for access to subdirectories
____________________________________
# Indicates features available only in full SORITEC. All other
  features are in SORITEC.

* Available second quarter 1985.
                Random Access Memory
              Required     Recommended
SORITEC         512K           640K

8087 high-speed math chip required for SORITEC Version 1.06B.

Number of Diskettes: SORITEC - 5 (1.7 Megabytes)
INDEX
A
ACCESS ................................. 37
Actual versus fitted.................... 64
Alpha Looping........................... 43
AMORT................................... 53
ANOVA table............................. 64
Arithmetic Mean......................... 50
Arithmetic Sum.......................... 50
Autocorrelation techniques.............. 58
B
Batch Processing........................ 10
Beta coefficients....................... 64
C
Cochrane-Orcutt ........................ 58
COMPUTE ................................ 14,16
Compute Moving Average.................. 49
Compute Moving Sum...................... 49
Conditional branching................... 43
CONSTANT................................ 13
Constants............................... 13
CONTENTS................................ 40
CONTINUE................................ 43
CONVERT................................. 46,47
Converting time-series from one
periodicity to another......... 45,46
COPY ................................... 38
CORC ................................... 58
CORREL ................................. 49
Correlation matrix...................... 49,64
Correlation Matrix Calculation.......... 49
COVA ................................... 49
Covariance matrix....................... 49
Covariance Matrix Calculation........... 49
CREATE.................................. 37
Cross-sectional data.................... 55
Crosstabulation Analysis................ 56
D
Data Interchange Format (DIF) Files..... 28
Data types.............................. 15
Databanks............................... 37
DIF File Input.......................... 28
DIF File Output......................... 30
DISCARD ................................ 40
Distribution of the residuals........... 63
DO ..................................... 41
DOT..................................... 43,44
DUMMY .................................. 45
Dummy variables......................... 45
E
Elasticities............................ 64
END..................................... 10,41
ENDDOT.................................. 44
EQUATION ............................... 14
Equations............................... 14
EXECUTE ................................ 11
Executing SAC Files..................... 10
EXOGENOUS .............................. 59
Exporting data.......................... 26
F
FILL.................................... 19,33
Financial functions..................... 51
Fitted equation......................... 59
FLAGS................................... 22
FORECAST ............................... 60
Forecasting single equation models...... 59
FOREQ................................... 60
FORGET.................................. 24
FORMAT.................................. 31
Formatted input and output.............. 31
FORTRAN formatted input................. 31
FORTRAN formatted output................ 32
G
Global options.......................... 22
GO TO (GOTO)............................ 42
Graphical Display....................... 34
Group expansion......................... 14
GROUP .................................. 14
H
HELLO................................... 9
Hildreth-Lu............................. 58
HILU.................................... 58
I
IF/THEN/ELSE............................ 43
Illegal transformations................. 17
Imputation of Missing Values............ 21
Importing data.......................... 26
IMPUTE ................................. 21
Input Journal Files..................... 11
Interactive mode........................ 62
Interactive Processing.................. 9
Interactive regression display.......... 62
Invoking SORITEC Sampler................ 9
Internal rate of return................. 51
IRR..................................... 51
J
JOB..................................... 10
K
KEEP ................................... 39
Keyboard Entry.......................... 33
L
LEGAL................................... 20
Line printer-style graphics............. 34
Loan amortization....................... 53
M
MA ..................................... 49
Mathematical functions.................. 16
Matrix.................................. 13
MAX..................................... 47,48
MAXERR ................................. 25
Maximum error limit..................... 25
Maximum Function........................ 47
Maximum value of a series............... 47
Mean and standard deviation of
the independent variables......... 64
MEAN ................................... 50
MIN .................................... 48
Minimum Function........................ 48
Minimum value of a data series.......... 48
MISSING................................. 19,20
Missing Data Handling................... 19
Missing Value Symbol Declaration........ 20
Missing Value Logical Function.......... 20
MOD .................................... 48
Modifiers, in the CONVERT command....... 47
Modular division........................ 45,48
Moving average.......................... 49
Moving sum.............................. 49
MSUM ................................... 49
N
Namelist................................ 14
Net present value....................... 52
Non-linear estimation................... 14
Null (Continuation) Statement........... 43
Numeric Looping......................... 41
O
OFFLIST................................. 25
ON ANOVA................................ 64
ON BETA................................. 64
ON CCOR................................. 64
ON CRT ................................. 57
ON GROUP................................ 14
ON PLOT................................. 64
ON REVISE............................... 19
ON STATS................................ 64
ON VCOV................................. 64
ONLIST.................................. 25
Options................................. 22
Ordinary least squares.................. 57
ORIGIN.................................. 57
Output of Data to the Terminal.......... 34
P
PARAMETER .............................. 13
Parameters.............................. 13
Partial correlation coefficients........ 64
Periodic dummy variable................. 45
PLOT ................................... 34
Prefix.................................. 44
Present value........................... 52
PRINT .................................. 34
PROCEDURE............................... 41
Programming language.................... 41
PUNCH .................................. 27
PUNCHDIF................................ 30
PURGE .................................. 38
PV ..................................... 52
Q
QUIT.................................... 10
R
READ.................................... 27,31,32
READDIF................................. 28
Recode a Variable....................... 46
RECODE.................................. 46
RECOVER................................. 22
REGRESS ................................ 57
Regression summary table................ 63
RENAME ................................. 39
REPLACE................................. 39
Residual summary table.................. 63
RETURN.................................. 38
REVISE.................................. 18
Revising Data .......................... 18
RMS .................................... 50
Root Mean Square........................ 50
S
SAL files............................... 26
SAL File Input.......................... 27
SAL File Output......................... 27
SCAN.................................... 25
SCATTER ................................ 36
Seasonal Dummies........................ 45
Selection menu.......................... 62
Serial autocorrelation.................. 58
Series of minimum values................ 48
Single-equation estimation techniques... 57
SORITEC................................. 6
SORITEC DataBank Files.................. 36,37
Special Symbols......................... 12
SSR .................................... 50
Standardized residuals.................. 64
Statistical Operations.................. 49,50
Sum of Squared Residuals................ 50
SUM .................................... 50
SWITCH ................................. 40
Symbol table............................ 23
SYMBOLS................................. 23
SYNOPSIS................................ 55
T
Tabular Display......................... 34
Time trend dummy series................. 45
TIME ................................... 45
Time-series variables................... 13
TITLE .................................. 25
Transformations......................... 16
Transforming continuous into
discrete variables................. 45
Two-stage least squares (2SLS) ......... 59
TWOSTAGE ............................... 59
U
Unconditional Branching................. 42
Uniform random numbers.................. 48
USE..................................... 15
USEIF .................................. 15
V
Variable Names.......................... 12
Variable Types.......................... 13
Variance-covariance matrix.............. 64
Vector.................................. 13
VECTOR ................................. 13
W
WIDTH .................................. 24
Wildcards............................... 21
WRITE................................... 32,33
X
XTAB ................................... 56
SORITEC INFORMATION REQUEST FORM
Yes, I'd like to receive more information about Sorites Group's Econometric
software products.
( ) Please send me information about SORITEC Version
1.06B.
( ) Please enter my name on SGI's mailing list to
receive information about new SORITEC releases.
( ) Send me the SORITEC Reference Manual. Enclosed is
(U.S.)$25.00 to cover the cost of the manual and
shipping.
( ) Send me the latest release of SORITEC Sampler,
including a bound copy of the SORITEC Sampler
Reference Manual and a copy of the SORITEC
Reference Manual. Enclosed is (U.S.)$50.00 to
cover the cost of materials and shipping.
Please print or type your name and address in the space below:
Name: _____________________________________
Affiliation: ______________________________
Address: __________________________________
___________________________________________
City:________________State: _______________
Country: ____________Postal Code: _________
Organizational affiliation: ( ) Commercial
( ) Government
( ) Academic
( ) Other ______________________
What type of computer do you own or use? ______________________
How many computers are at your address? _______
Complete and Mail to: The Sorites Group, Inc.
P.O. Box 2939
Springfield, VA 22152